Chimeric Histone Acetyltransferase Polypeptides

TECHNICAL FIELD 

This invention relates to methods and materials for analyzing and modulating 
gene expression. In particular, the invention features chimeric histone acetyltransferase 
polypeptides that can be used to determine gene expression profiles in specific cells, and
to modulate gene expression in specific cells. 

BACKGROUND 

Genes often are differentially expressed during the development of an organism,
and in particular cells in an organism. Understanding and manipulating an organism's
temporal and spatial gene expression profile can be useful for developing new and
improved biological products and therapies. Among the array of regulatory mechanisms
that affect the gene expression profile of an organism, chromatin remodeling has an 
important role. 

Eukaryotic DNA is tightly packaged into chromatin. The most basic element of
DNA packaging is the nucleosome, which consists of an octamer of histone proteins
wrapped by about 146 nucleotide base pairs. The compaction of eukaryotic DNA into
nucleosomes and the formation of nucleosome arrays present natural barriers to genetic
regulatory proteins, and to enzymes that interact with DNA. Chromatin-associated
protein complexes reportedly can, among other things, stabilize and destabilize
nucleosomal DNA and thereby affect nuclear processes that use DNA as a substrate (e.g.,
transcription, replication, DNA repair, and DNA organization) as well as regulators of 
these processes. 

Some chromatin-associated protein complexes are reported to use the energy of
ATP hydrolysis to increase histone mobility, and to thereby change the accessibility of
certain nucleosomal DNA to enzymes that process genetic information and to genetic
regulatory proteins. It is thought that ATP-dependent chromatin-remodeling protein
complexes can have a role in both gene activation and repression. Researchers have
reported the existence of ATP-dependent chromatin-remodeling protein complexes in
organisms including yeast (e.g., SWI/SNF, RSC, ISW1, ISW2, and Ino80), Drosophila
(e.g., dSWI/SNF, ACF, CHRAC, and NURF), and human (e.g., hSWI/SNF, NuRD, RSF,
and ACF).

Other chromatin-associated protein complexes are reported to change chromatin
structure by covalently modifying histones (e.g., by adding or removing acetyl, methyl,
phosphate or ubiquitin). It is thought that by covalently modifying histones, these protein
complexes can affect chromatin structure and thereby change the accessibility of 
nucleosomal DNA to enzymes that process genetic information and to genetic regulatory 
proteins. Some of these histone-modifying protein complexes also are thought to affect 
the activity of ATP-dependent chromatin-remodeling complexes. 

For example, some histone-modifying chromatin-associated protein complexes
reportedly contain a polypeptide subunit having histone acetyltransferase ("HAT")
enzymatic activity. Such protein complexes are, in general, thought to have a role in
activating transcription. Researchers have reported the existence of polypeptides having
HAT enzymatic activity in organisms including yeast, Tetrahymena, and humans.

As another example, some histone-modifying chromatin-associated protein
complexes reportedly contain a polypeptide subunit having histone deacetylase
("HDAC") enzymatic activity. Such protein complexes are, in general, thought to have a
role in repressing transcription. Researchers have reported the existence of polypeptides
having HDAC enzymatic activity in organisms including yeast, C. elegans, Drosophila,
Xenopus, chicken, mouse, human and maize.

SUMMARY 

The present invention relates to chimeric histone acetyltransferase ("HAT") 
polypeptides useful for determining gene expression profiles in specific cell types, or for 
modulating gene expression in specific cell types. For example, chimeric HAT 
polypeptides can be used to affect gene expression to achieve desirable results, such as
enhancing expression of specific genes in a eukaryotic organism. Chimeric HAT
polypeptides contain a polypeptide segment that has HAT enzymatic activity and a
polypeptide segment that is similar or identical to a subunit of a chromatin-associated
protein complex having histone deacetylase ("HDAC") enzymatic activity.

Thus, the invention features chimeric polypeptides that contain: 1) a first
polypeptide segment that exhibits histone acetyltransferase activity, and 2) a second
polypeptide segment having 40% or greater (e.g., at least 40%, at least 60%, at least 80%
and at least 90%) sequence identity to a subunit of a histone deacetylase
chromatin-associated protein complex (e.g., a subunit that exhibits scaffold activity, a
subunit that exhibits DNA binding activity, a subunit that exhibits ATPase-dependent
helicase activity, and a subunit that exhibits histone deacetylase activity). The first and
second polypeptide segments are arranged such that a terminus of the second polypeptide
segment is linked to a terminus of the first polypeptide segment via at least one covalent
bond.

In some embodiments, the first and second polypeptide segments can be directly
linked via a peptide bond. In such embodiments the C-terminal amino acid of the first
polypeptide segment can be linked to the N-terminal amino acid of the second
polypeptide segment. Alternatively, the N-terminal amino acid of the first polypeptide
segment can be linked to the C-terminal amino acid of the second polypeptide segment.
In some embodiments, the first and second polypeptide segments can be indirectly linked
via one or more (e.g., 1 to 50, and 10 to 50) intervening amino acids that are situated
between the first and second polypeptides. In such embodiments, the C-terminal amino
acid of the first polypeptide segment can be linked to an intervening amino acid, and the
N-terminal amino acid of the second polypeptide segment can be linked to an intervening
amino acid. Alternatively, the N-terminal amino acid of the first polypeptide segment can
be linked to an intervening amino acid, and the C-terminal amino acid of the second
polypeptide segment can be linked to an intervening amino acid. In some embodiments,
the intervening amino acids include at least one alanine residue and/or at least one
glycine residue.
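
The direct and indirect linkage arrangements described above can be sketched at the
sequence level. The following Python fragment is an illustration only, not part of the
disclosed methods: the function names are hypothetical, the sequences are short
placeholders rather than real HAT or CAP/HDAC subunit sequences, and the 1 to 50
residue check simply mirrors the example range given in the text rather than a limit.

```python
# Illustrative sketch only: the sequences below are short placeholders,
# not real HAT or CAP/HDAC subunit sequences.

def build_chimera(first_segment, second_segment, linker="", first_at_n_terminus=True):
    """Return the one-letter sequence of a fusion of two segments.

    An empty linker models a direct peptide bond between the segments;
    a non-empty linker models intervening amino acids.
    """
    if first_at_n_terminus:
        # C-terminus of first segment -> linker -> N-terminus of second segment
        return first_segment + linker + second_segment
    # C-terminus of second segment -> linker -> N-terminus of first segment
    return second_segment + linker + first_segment


def describe_linker(linker):
    """Summarize a linker against the example ranges given in the text."""
    return {
        "length": len(linker),
        "within_1_to_50": 1 <= len(linker) <= 50,
        "has_alanine_or_glycine": any(res in "AG" for res in linker.upper()),
    }


if __name__ == "__main__":
    hat_like = "MKTLLV"   # placeholder for a first (HAT) segment
    cap_like = "SQDEFR"   # placeholder for a second (CAP/HDAC subunit) segment
    print(build_chimera(hat_like, cap_like))                  # direct linkage
    print(build_chimera(hat_like, cap_like, linker="GGAGG"))  # indirect linkage
    print(describe_linker("GGAGG"))
```

Passing first_at_n_terminus=False gives the alternative orientation, in which the
second polypeptide segment precedes the first.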

The invention also features nucleic acid constructs that encode such chimeric
polypeptides, and eukaryotic organisms that include such chimeric polypeptides. 

The invention also features eukaryotic organisms that contain a nucleic acid that
encodes a chimeric polypeptide having: 1) a first polypeptide segment that exhibits
histone acetyltransferase activity; and 2) a second polypeptide segment that has 40% or
greater sequence identity to a subunit of a histone deacetylase chromatin-associated
protein complex. The first and second polypeptide segments of the encoded chimeric
polypeptide are arranged such that a terminus of the second polypeptide segment is
covalently linked to a terminus of the first polypeptide segment. The nucleic acid can be
operably linked to a promoter.

The invention also features eukaryotic organisms that contain: 1) a first nucleic
acid construct having a first promoter and a transcription activator element operably
linked to a coding sequence that encodes a chimeric polypeptide, and 2) a second nucleic
acid construct having a second promoter conferring cell type-specific transcription
operably linked to a coding sequence for a polypeptide that binds the transcription
activator element. The encoded chimeric polypeptide has: 1) a first polypeptide segment
that exhibits histone acetyltransferase activity, and 2) a second polypeptide segment that
has 40% or greater sequence identity to a subunit of a histone deacetylase
chromatin-associated protein complex. The first and second polypeptide segments of an
encoded chimeric polypeptide are arranged such that a terminus of the second polypeptide
segment is covalently linked to a terminus of the first polypeptide segment. In some
embodiments, the organism is an animal. In other embodiments the organism is a plant
(e.g., a monocot such as corn and rice, or a dicot such as soybean and rape). In some
embodiments, the plant contains a mutation or agent that alters (i.e., increases or
decreases) the DNA methylation state in the plant relative to a corresponding plant that
lacks said agent or mutation. In some embodiments, the mutation is in a C5 DNA
methyltransferase (a.k.a. cytosine C5 DNA methyltransferase) gene. In some embodiments,
the agent is an antisense nucleic acid. In some embodiments, the agent affects expression
of a C5 DNA methyltransferase gene.

The invention also features methods for detecting the expression of one or more
genes in a eukaryote. The methods involve isolating macromolecules from one or more
specific cells in a eukaryote (e.g., a plant or an animal) that contains a nucleic acid
construct in which a promoter is operably linked to a coding sequence that encodes a
chimeric polypeptide, and then determining the presence or amount of at least one of the
macromolecules in at least one of the specific cells. The encoded chimeric polypeptide
has: 1) a first polypeptide segment that exhibits histone acetyltransferase activity, and 2)
a second polypeptide segment that has 40% or greater sequence identity to a subunit of a
histone deacetylase chromatin-associated protein complex. The first and second
polypeptide segments of the encoded chimeric polypeptide are arranged such that a
terminus of the second polypeptide segment is covalently linked to a terminus of the first
polypeptide segment. In some embodiments, the macromolecules are polypeptides. In
some embodiments, the macromolecules are nucleic acids. In some embodiments, the
promoter confers cell-type specific transcription in a plant reproductive tissue (e.g., ovule,
central cell, endosperm, embryo, and zygote). In some embodiments, the promoter
confers cell-type specific transcription in a plant vegetative tissue.

In some embodiments, the eukaryote also contains a second nucleic acid
construct. In such embodiments, the first nucleic acid construct has a recognition site for
a transcriptional activator operably linked to the promoter and the coding sequence. The 
second nucleic acid construct has a second promoter conferring cell-type specific 
transcription that is operably linked to a coding sequence for a polypeptide that binds the 
recognition site for the transcriptional activator. 

The invention also features methods for modulating gene expression in a
eukaryote. The methods involve making a eukaryote (e.g., a plant or an animal) having a
nucleic acid construct in which a cell-type specific promoter is operably linked to a
coding sequence that encodes a chimeric polypeptide. The encoded chimeric polypeptide
has: 1) a first polypeptide segment that exhibits histone acetyltransferase activity, and 2)
a second polypeptide segment that has 40% or greater sequence identity to a subunit of a
histone deacetylase chromatin-associated protein complex. The first and second
polypeptide segments of the encoded chimeric polypeptide are arranged such that a
terminus of the second polypeptide segment is covalently linked to a terminus of the first
polypeptide segment. The eukaryote exhibits modulated gene expression in cells in
which the promoter confers cell-type specific transcription. In some embodiments, the
eukaryote has compositional alterations relative to a corresponding organism that lacks
said nucleic acid construct. In some embodiments, the eukaryote has developmental
alterations relative to a corresponding organism that lacks said nucleic acid construct. In
some embodiments, the eukaryote has phenotypic alterations relative to a corresponding
organism that lacks said nucleic acid construct.

In some embodiments, the organism is a plant. In some embodiments, the
promoter confers cell-type specific transcription in a plant reproductive tissue (e.g., ovule,
central cell, endosperm, embryo, and zygote). In some embodiments, the promoter
confers cell-type specific transcription in a plant vegetative tissue. In some embodiments,
the plant contains a mutation or agent that alters (e.g., increases or decreases) the DNA
methylation state in the plant relative to a corresponding plant that lacks said agent or
mutation. In some embodiments, the mutation is in a C5 DNA methyltransferase gene.
In some embodiments, the agent is an antisense nucleic acid. In some embodiments, the
agent affects expression of a C5 DNA methyltransferase gene. In some embodiments,
modulated gene expression alters seed development. In some embodiments, modulated
gene expression alters embryo development. In some embodiments, modulated gene
expression alters endosperm development. In some embodiments, modulated gene
expression alters seed yield by mass.

The invention also features methods for modulating gene expression in a
eukaryote that involve making a eukaryote (e.g., a plant or an animal) that has 1) a first
nucleic acid construct having a first promoter and a transcription activator element
operably linked to a coding sequence that encodes a chimeric polypeptide, and 2) a
second nucleic acid construct having a second promoter conferring cell type-specific
transcription operably linked to a coding sequence for a polypeptide that binds the
transcription activator element. The encoded chimeric polypeptide has: 1) a first
polypeptide segment that exhibits histone acetyltransferase activity, and 2) a second
polypeptide segment that has 40% or greater sequence identity to a subunit of a histone
deacetylase chromatin-associated protein complex. The first and second polypeptide
segments of the encoded chimeric polypeptide are arranged such that a terminus of the
second polypeptide segment is covalently linked to a terminus of the first polypeptide
segment. The eukaryote exhibits modulated gene expression in cells in which the second
promoter confers cell-type specific transcription. In some embodiments, the eukaryote
has compositional alterations relative to a corresponding organism that lacks said nucleic
acid construct. In some embodiments, the eukaryote has developmental alterations
relative to a corresponding organism that lacks said nucleic acid construct. In some
embodiments, the eukaryote has phenotypic alterations relative to a corresponding
organism that lacks said nucleic acid construct.

In some embodiments, the organism is a plant. In some embodiments, the second
promoter confers cell-type specific transcription in a plant reproductive tissue (e.g., ovule,
central cell, endosperm, embryo, and zygote). In some embodiments, the second
promoter confers cell-type specific transcription in a plant vegetative tissue. In some
embodiments, the plant contains a mutation or agent that alters (e.g., increases or
decreases) the DNA methylation state in the plant relative to a corresponding plant that
lacks said agent or mutation. In some embodiments, the mutation is in a C5 DNA
methyltransferase gene. In some embodiments, the agent is an antisense nucleic acid. In
some embodiments, the agent affects expression of a C5 DNA methyltransferase gene. In
some embodiments, modulated gene expression alters seed development. In some
embodiments, modulated gene expression alters embryo development. In some
embodiments, modulated gene expression alters endosperm development. In some
embodiments, modulated gene expression alters seed yield by mass.

The invention also features methods for making a genetically modified eukaryote.
The methods involve making a first eukaryote (e.g., a plant or an animal) that has a first
nucleic acid construct having a first promoter and a transcription activator element
operably linked to a coding sequence. The coding sequence encodes a first polypeptide
segment and a second polypeptide segment. The first polypeptide segment exhibits
histone acetyltransferase activity, and the second polypeptide segment has 40% or greater
sequence identity to a subunit of a histone deacetylase chromatin-associated
protein complex. The first and second polypeptide segments of the encoded chimeric
polypeptide are arranged such that a terminus of the second polypeptide segment is
covalently linked to a terminus of the first polypeptide segment. The methods also
involve making a second eukaryote that has a second nucleic acid construct having a
promoter that confers embryo-specific transcription operably linked to a coding sequence
encoding a polypeptide that binds the transcription activator element of the first nucleic
acid construct. The methods also involve crossing the first and second eukaryotes to form
genetically modified progeny that are sterile.

Unless otherwise defined, all technical and scientific terms used herein have the
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Although methods and materials similar or equivalent to those
described herein can be used to practice the invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In case of conflict, the
present specification, including definitions, will control. In addition, the materials,
methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the
following detailed description.

DETAILED DESCRIPTION

This invention features chimeric histone acetyltransferase ("HAT") polypeptides.
Chimeric HAT polypeptides can be used to determine and modulate gene expression 
profiles in eukaryotic organisms. 

Chimeric polypeptides 

A chimeric HAT polypeptide contains at least two polypeptide segments: a first
polypeptide segment that exhibits HAT enzymatic activity, and a second polypeptide
segment that is substantially identical to a subunit of those chromatin-associated protein
complexes having histone deacetylase ("HDAC") activity. A chimeric HAT
polypeptide typically is not found in nature.

First polypeptide segment

A polypeptide segment that exhibits HAT enzymatic activity is a suitable first
polypeptide segment of a chimeric HAT polypeptide. Whether a first polypeptide
segment exhibits HAT enzymatic activity can be determined by testing either the
polypeptide segment or the chimeric HAT polypeptide using an assay that measures the
transfer of an acetyl functional group from an acetyl donor such as acetyl CoA to a
histone polypeptide or polypeptide segment. See e.g., Brownell, J. and Allis, C.D. (1995)
Proc. Natl. Acad. Sci. 92, 6364-6368. This assay can be used to screen candidate
polypeptide segments for HAT enzymatic activity, and to test chimeric polypeptides for
HAT enzymatic activity.

In some embodiments, a first polypeptide segment has an amino acid sequence
that corresponds to the amino acid sequence of one of the following polypeptides: yeast
Esa1, Gcn5, Sas3, yTAFII130, ELP3, HAT1 or Hpa2; Drosophila dGcn5, dTAFII230 or
MOF; Tetrahymena p55; or human hGcn5, p300/CBP, PCAF, Tip60, hTAFII250,
TFIIIC90/110/220, SRC-1 or ACTR. In other embodiments, a first polypeptide segment
can have an amino acid sequence with substitutions, insertions or deletions relative to one
of the above-mentioned polypeptides. Any polypeptide segment having HAT enzymatic
activity is suitable as a first polypeptide segment, irrespective of the number or character
of amino acid insertions, deletions, or substitutions. Thus, in some embodiments, the
amino acid sequence of a first polypeptide segment corresponds to less than the
full-length sequence (e.g., a HAT functional domain) of one of the above-mentioned
polypeptides.

One of skill will recognize that individual substitutions, deletions or additions to a
polypeptide that alter, add or delete a single amino acid or a small percentage of amino
acids in the encoded sequence are "conservatively modified variants" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well
known in the art. The following six groups each contain amino acids that are
conservative substitutions for one another:

1) Alanine (A), Serine (S), Threonine (T);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
(see e.g., Creighton, Proteins (1984)).
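
The six groups listed above can be encoded directly as a lookup. The following Python
sketch is offered only as an illustration: the function names are hypothetical, the
example sequences are made up, and the convention that an unchanged residue is not
counted as a substitution is an assumption of this sketch rather than a statement from
the text.

```python
# The six conservative-substitution groups listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    frozenset("AST"),   # Alanine, Serine, Threonine
    frozenset("DE"),    # Aspartic acid, Glutamic acid
    frozenset("NQ"),    # Asparagine, Glutamine
    frozenset("RK"),    # Arginine, Lysine
    frozenset("ILMV"),  # Isoleucine, Leucine, Methionine, Valine
    frozenset("FYW"),   # Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative_substitution(original, replacement):
    """True if both residues differ and fall in the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # assumption: an unchanged residue is not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, conservative?) for equal-length sequences."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length (substitutions only)")
    results = []
    for position, (ref, var) in enumerate(zip(reference, variant), start=1):
        if ref != var:
            results.append((position, ref, var, is_conservative_substitution(ref, var)))
    return results


if __name__ == "__main__":
    # Made-up five-residue example: K->R and F->W are both within a listed group.
    print(classify_substitutions("MKDLF", "MRDLW"))
```

Residues that appear in none of the six groups (for example glycine or proline) are
never reported as conservative by this sketch, which matches the list as written.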

Other suitable candidates for first polypeptide segments can be identified by
homologous polypeptide sequence analysis. A similar analysis can be applied to identify
suitable candidates for second polypeptide segments. HAT amino acid sequence families
are known to be conserved. For example, plant histone acetyltransferase genes can be
identified by BLAST or PSI-BLAST analysis of nonredundant protein databases using
known plant, yeast and/or animal histone acetyltransferase amino acid sequences.
Homologous polypeptide sequence analysis involves the identification of conserved
regions in a template polypeptide, also referred to herein as a subject polypeptide.
Conserved regions can be identified by locating a region within the primary amino acid
sequence of a template polypeptide that is a repeated sequence, forms some secondary
structure such as helices, beta sheets, etc., establishes positively or negatively charged
domains, or represents a protein motif or domain. See e.g., Bouckaert et al., U.S. Ser.
No. 60/121,700, filed February 25, 1999, and the Pfam web site describing consensus
sequences for a variety of protein motifs and domains at http://www.sanger.ac.uk/Pfam/
and http://genome.wustl.edu/Pfam/. The information included in the Pfam database is
described in Sonnhammer et al., Nucl. Acids Res. 26:320-322 (1998); Sonnhammer
et al., Proteins 28:405-420 (1997); and Bateman et al., Nucl. Acids Res. 27:260-262
(1999). From the Pfam database, consensus
sequences of protein motifs and domains can be aligned with the template polypeptide
sequence to determine conserved region(s).

Conserved regions also can be determined by aligning sequences of the same or
related polypeptides from closely related plant species. Closely related plant species
preferably are from the same family. Alternatively, alignments are performed using
sequences from plant species that are all monocots or are all dicots. In some
embodiments, alignment of sequences from two different plant species is adequate. For
example, sequences from canola and Arabidopsis can be used to identify one or more
conserved regions. Such related polypeptides from different plant species need not
exhibit an extremely high sequence identity to aid in determining conserved regions. For
example, polypeptides that exhibit about 35% sequence identity can be useful to identify
a conserved region. Typically, conserved regions of related proteins exhibit at least 40%
sequence identity; or at least about 50%; or at least 60%, or at least 70%, at least 80%, or
at least 90% sequence identity. In some embodiments, a conserved region of target and
template polypeptides exhibits at least 92, 94, 96, 98, or 99% sequence identity. Sequence
identity can be either at the amino acid or nucleotide level.
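
One simple way to look for conserved regions in two sequences that have already been
aligned is to scan a fixed-size window along the alignment and report windows whose
identity meets a chosen threshold. The following Python sketch illustrates that idea
only. It is not the procedure of the cited references: the function names, the window
size, the use of "-" for gaps, and the example sequences are all assumptions made for
illustration.

```python
def window_identity(aln_a, aln_b, start, window):
    """Percent of columns in the window where both sequences carry the same residue."""
    columns = zip(aln_a[start:start + window], aln_b[start:start + window])
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / window


def conserved_windows(aln_a, aln_b, window=10, min_identity=40.0):
    """Return (start, percent_identity) for each window meeting the threshold.

    aln_a and aln_b are two rows of a pairwise alignment (equal length,
    gaps written as '-'); start positions are 0-based alignment columns.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(0, len(aln_a) - window + 1):
        identity = window_identity(aln_a, aln_b, start, window)
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


if __name__ == "__main__":
    # Made-up aligned fragments; real input would come from an alignment program.
    seq_a = "MKTAYIAKQR"
    seq_b = "MKTAYLSNQR"
    for start, identity in conserved_windows(seq_a, seq_b, window=5, min_identity=60.0):
        print(start, identity)
```

The 40% default mirrors the lower bound mentioned in the text for conserved regions of
related proteins; the threshold and window size would be tuned to the sequences at hand.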
In some embodiments, a first polypeptide segment is the polypeptide encoded by
the maize HAC101, HAC104, HAC105, HAC107 or HAC109 gene or a homolog thereof.
The maize HAC101 gene belongs to the CREB-Binding Protein family of transcriptional
co-activators with histone acetyltransferase activity. Maize HAC104 is most homologous
to the GCN5 family of HATs in yeast and animals. Maize HAC105 is most homologous
to the ESA1 related family of HATs in yeast and animals. Maize HAC107 is most
homologous to the ELP3 related family of HATs in yeast and animals. Maize HAC109 is
most homologous to the HAT1 related family of HATs in yeast and animals. In other
embodiments, polypeptides having modifications relative to the above polypeptides are
suitable first polypeptide segments.

In some embodiments, a first polypeptide segment is the polypeptide encoded by
the Arabidopsis HAC1, HAC2, HAC3, HAC4, HAC7 or HAC8 gene or a homolog
thereof. Arabidopsis HAC2 and HAC4 genes encode HATs that are homologous to
human CREB-binding proteins. Arabidopsis HAC3 is homologous to yeast Gcn5.
Arabidopsis HAC1 is homologous to yeast HAT1. In other embodiments, polypeptides
having modifications relative to the above polypeptides are suitable first polypeptide
segments.

Exemplary amino acid sequences of HAT polypeptides are shown in Table 6. 
Yet other first polypeptide segments can be synthesized on the basis of consensus
HAT functional domains. See e.g., Table 13.

Second polypeptide segment 

Chimeric polypeptides of the invention have a second polypeptide segment that is
covalently linked to the first polypeptide segment. A second polypeptide segment can
have substantial identity, or can be identical, to a subunit of certain chromatin-associated
protein ("CAP") complexes, i.e., those CAP complexes having a subunit that exhibits
histone deacetylase activity ("CAP/HDAC complexes"). CAP/HDAC complexes
include, for example, polycomb group (PcG) complexes, SIN3/HDAC-containing
complexes, Mad-Max complexes, Tup1-Ssn6 complexes, DNMT1 complexes, MeCP1
and MeCP2 complexes, MBD complexes, and Ikaros-Aiolos-containing complexes.
Amino acid sequences of subunits of CAP/HDAC complexes generally are conserved
among different species.

CAP/HDAC complexes can be distinguished from other chromatin-associated
protein complexes by the presence of a subunit that exhibits histone deacetylase activity.
Alternatively, CAP/HDAC complexes can be distinguished from other
chromatin-associated protein complexes by the presence of a subunit that exhibits sequence
homology to known histone deacetylase proteins. In contrast, other chromatin-associated
protein complexes either have histone acetyltransferase activity or have neither HAT nor
HDAC activity. CAP/HDAC complexes also can be distinguished from other
chromatin-associated protein complexes by their effect, in vitro or in vivo, on gene expression.
Transcription from genes in nucleosomes to which CAP/HDAC complexes are bound
typically is reduced or even eliminated. In contrast, chromatin-associated protein
complexes having a HAT subunit typically facilitate increased transcription from genes in
nucleosomes to which such complexes are bound. CAP/HDAC complexes can be
distinguished from transcription complexes by the lack of any subunit that interacts
directly with RNA polymerase II. CAP/HDAC complexes can be readily distinguished
from nucleosomes because CAP/HDAC complexes do not have histones as subunits of
the complex.

Whether a particular complex possesses a subunit that exhibits HDAC activity can
be determined by testing a putative CAP/HDAC complex, or its subunits, for HDAC
activity. HDAC activity can be determined by an assay that measures the removal of an
acetyl group from a histone polypeptide or histone polypeptide segment. See e.g., van der
Vlag, J. and Otte, A.P. Nature Genetics 25, 474-478 (1999). This assay can be used to
screen subunits of candidate CAP complexes for HDAC activity. Alternatively, a CAP
complex can be shown to possess a subunit having HDAC activity by sequence identity to
a subunit of a known CAP/HDAC complex, as described herein.

Once a CAP complex has been determined to possess a histone deacetylase as one 
subunit of the complex, then all subunits of that particular CAP/HDAC complex can be 
tested for their suitability as a second polypeptide segment. Polypeptides can be 
identified as subunits of a CAP/HDAC complex by their co-purification with the
complex.

In some embodiments, the second polypeptide segment is the subunit that is
HDAC itself. Such subunits can be identified using the above-described assay for HDAC
enzymatic activity. The following polypeptides having HDAC enzymatic activity have
been identified: yeast RPD3, HDA1, HOS1, HOS2, and HOS3; C. elegans HDA1,
HDA2, HDA3; Drosophila dHDAC1, dHDAC2, dHDAC3, and dHDA2; Xenopus HDm;
chicken HDAC1, HDAC2, and HDAC3; mouse HDAC1, HDAC2, HDAC3, mHDA1,
and mHDA2; human HDAC1, HDAC2, HDAC3, HDAC4, HDAC5, HDAC6, HDAC7,
and HDAC8; and maize RPD3 and HD2. See e.g., Cress, W.D. and Seto, E. (2000) J.
Cell. Physiol. 184, 1-16. All of the above HDAC polypeptides are suitable for use as the
second polypeptide segment, as are homologous polypeptides and recombinant
polypeptides (i.e., polypeptides having amino acid insertions, deletions, or substitutions)
having greater than 40% sequence identity.

Subunits of CAP/HDAC complexes also can be identified by
coimmunoprecipitation using antibodies against known CAP/HDAC subunits.
Purification of CAP/HDAC subunits using coimmunoprecipitation has been described,
for example, in: Jones, P.L. et al. Nature Genet 19, 187-191 (1998); van der Vlag, J. and
Otte, A.P. Nature Genetics 25, 474-478 (1999); Wade, P.A. et al. Nature Genetics 23,
62-66 (1999); Ng, H.H. et al. Nature Genetics 23, 58-61 (1999); and Spillane, C. et al. Curr
Biol 10, 1535-1538 (2000).

Subunits of CAP/HDAC complexes also can be identified by yeast two-hybrid
analyses using hybrid polypeptides containing known CAP/HDAC subunits. Use of the
yeast two-hybrid system to identify CAP/HDAC subunits has been described, for
example, in: Yadegari, R. et al. Plant Cell 12, 2367-2381 (2000); and Spillane, C. et al.
Curr Biol 10, 1535-1538 (2000).

In some instances, suitable second polypeptide segments can be synthesized on
the basis of consensus functional domains and/or conserved regions in polypeptides that
are homologous subunits of a CAP/HDAC complex. Consensus domains and conserved
regions can be identified by homologous polypeptide sequence analysis as described
herein. The suitability of such synthetic polypeptides for use as a second polypeptide
segment can be evaluated by the techniques described herein, or by evaluating the ability
of a synthetic polypeptide to effectively substitute for a corresponding subunit when
expressed in a eukaryotic organism.

Many CAP/HDAC complexes and CAP/HDAC complex subunits are known to be
conserved in plants, fungi and animals. Subunits of a CAP/HDAC complex in one
organism can be used to identify homologous subunits in another organism, e.g.,
homologs of a subunit of a known CAP/HDAC complex can be identified by performing
a BLAST query on a database of protein sequences. Those proteins in the database that
have greater than 40% sequence identity are candidates for further evaluation for
suitability as a second polypeptide segment. For example, the Arabidopsis polycomb
group proteins FIE and MEA have significant sequence identity to the Drosophila
proteins extra sex combs and enhancer of zeste. If desired, manual inspection of such
candidates can be carried out in order to narrow the number of candidates for further
evaluation. Manual inspection is carried out by selecting those candidates that appear to
have domains suspected of being present in subunits of CAP/HDAC complexes.
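
The database screen described above can be sketched as follows. This is a minimal illustration only, assuming the BLAST results have been exported as tab-delimited text in which the second field is the hit identifier and the third field is the percent identity (the layout of the tabular output of current NCBI BLAST+; output from the stand-alone version cited in this document may be laid out differently). The function name, the query name and the hit identifiers are hypothetical.

```python
def candidate_subunits(tabular_hits, min_identity=40.0):
    """Return (hit_id, percent_identity) pairs above the identity cutoff.

    Assumes tab-delimited lines: query id, hit id, percent identity, ...
    """
    candidates = []
    for line in tabular_hits.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hit_id, identity = fields[1], float(fields[2])
        if identity > min_identity:  # "greater than 40%" in the text
            candidates.append((hit_id, identity))
    return candidates


# Hypothetical hits, for illustration only (not real database entries).
example_hits = "\n".join([
    "known_subunit\thypothetical_hit_A\t62.5\t320",
    "known_subunit\thypothetical_hit_B\t40.0\t298",
    "known_subunit\thypothetical_hit_C\t41.3\t305",
])

for hit_id, identity in candidate_subunits(example_hits):
    print(hit_id, identity)
```

Hits passing this cutoff are only candidates; the manual inspection and phenotypic evaluation described in the text would still be needed.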

Further evaluation can be carried out by creating a chimeric polypeptide having
the candidate as the second segment, inserting the chimeric polypeptide into a eukaryotic
organism, and evaluating the phenotypic effect of the chimeric polypeptide in the
organism. If the desired phenotypic effect(s) is observed, the candidate is suitable as a
second polypeptide segment.

A percent identity for any subject nucleic acid or amino acid sequence (e.g., any
of the chimeric polypeptide first polypeptide segments, or second polypeptide segments
described herein) relative to another "target" nucleic acid or amino acid sequence can be
determined as follows. First, a target nucleic acid or amino acid sequence of the
invention can be compared and aligned to a subject nucleic acid or amino acid sequence
using the BLAST 2 Sequences (Bl2seq) program from the stand-alone version of
BLASTZ containing BLASTN and BLASTP (e.g., version 2.0.14). The stand-alone
version of BLASTZ can be obtained at <www.fr.com> or <www.ncbi.nlm.nih.gov>.
Instructions explaining how to use BLASTZ, and specifically the Bl2seq program, can be
found in the 'readme' file accompanying BLASTZ. The programs also are described in
detail by Karlin et al. (Proc. Natl. Acad. Sci. USA, 87:2264 (1990) and 90:5873 (1993))
and Altschul et al. (Nucl. Acids Res., 25:3389 (1997)).

Bl2seq performs a comparison between a subject sequence and a target sequence
using either the BLASTN (used to compare nucleic acid sequences) or BLASTP (used to
compare amino acid sequences) algorithm. Typically, the default parameters of a
BLOSUM62 scoring matrix, gap existence cost of 11 and extension cost of 1, a word size
of 3, an expect value of 10, a per residue cost of 1 and a lambda ratio of 0.85 are used
when performing amino acid sequence alignments. The output file contains aligned
regions of homology between the target sequence and the subject sequence. Once
aligned, a length is determined by counting the number of consecutive nucleotides or
amino acid residues (i.e., excluding gaps) from the target sequence that align with
sequence from the subject sequence starting with any matched position and ending with
any other matched position. A matched position is any position where an identical
nucleotide or amino acid residue is present in both the target and subject sequence. Gaps
of one or more residues can be inserted into a target or subject sequence to maximize
sequence alignments between structurally conserved domains.

The percent identity over a particular length is determined by counting the number
of matched positions over that particular length, dividing that number by the length and
multiplying the resulting value by 100. For example, if (i) a 500 amino acid target
sequence is compared to a subject amino acid sequence, (ii) the Bl2seq program presents
200 amino acids from the target sequence aligned with a region of the subject sequence
where the first and last amino acids of that 200 amino acid region are matches, and (iii)
the number of matches over those 200 aligned amino acids is 180, then the 500 amino
acid target sequence contains a length of 200 and a sequence identity over that length of
90% (i.e., 180 ÷ 200 × 100 = 90). In some embodiments, the amino acid sequence of a
second polypeptide segment has 40% sequence identity to the amino acid sequence of a
subunit of a CAP/HDAC complex. In some embodiments, the amino acid sequence of a
second polypeptide segment has greater than 40% sequence identity (e.g., > 80%, > 70%,
> 60%, > 50% or > 40%) to the amino acid sequence of a subunit of a CAP/HDAC
complex.
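
The length and percent identity calculation described above can be sketched as follows. This is an illustrative reading of the text rather than the Bl2seq implementation: it assumes the two aligned sequences are supplied as equal-length strings with "-" marking gaps, and it takes the first and last matched positions of the alignment as the endpoints. The function names and the short example sequences are hypothetical.

```python
def identity_from_counts(matches, length):
    """Percent identity = matched positions / length x 100."""
    return 100.0 * matches / length


def length_and_identity(target_aln, subject_aln, gap="-"):
    """Return (length, percent identity) between the first and last match.

    Length counts target residues only (gaps in the target are excluded).
    """
    if len(target_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    matched = [
        i for i, (t, s) in enumerate(zip(target_aln, subject_aln))
        if t == s and t != gap
    ]
    if not matched:
        return 0, 0.0
    start, end = matched[0], matched[-1]
    length = sum(1 for t in target_aln[start:end + 1] if t != gap)
    return length, identity_from_counts(len(matched), length)


# Worked example from the text: 180 matches over a length of 200.
print(identity_from_counts(180, 200))

# Hypothetical toy alignment: 5 matches over 6 target residues.
print(length_and_identity("MKT-AYI", "MKTWAHI"))
```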

It will be appreciated that a nucleic acid or amino acid target sequence that aligns
with a subject sequence can result in many different lengths with each length having its
own percent identity. It is noted that the percent identity value can be rounded to the
nearest tenth. For example, 78.11, 78.12, 78.13, and 78.14 are rounded down to 78.1,
while 78.15, 78.16, 78.17, 78.18, and 78.19 are rounded up to 78.2. It is also noted that the
length value will always be an integer.
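
The rounding convention stated above (values ending in 5 or more at the hundredths place round up) can be reproduced as in the sketch below. Decimal arithmetic is used here as an assumption of convenience, because binary floating-point rounding does not always follow the half-up rule; the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_identity(value):
    """Round a percent identity to the nearest tenth, halves rounding up."""
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


for value in ("78.11", "78.14", "78.15", "78.19"):
    print(value, "->", round_identity(value))
```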

A partial list of nucleic acids encoding proteins that are subunits of CAP/HDAC
complexes is shown in Table 1. The nucleic acids shown in Table 1 encode proteins that
are subunits of CAP/HDAC complexes often referred to as polycomb group (PcG)
complexes. Such proteins are candidates to be the second polypeptide segment.



Table 1. Polycomb Group Subunits

Genes                                   GI number-Source
--------------------------------------  ------------------------
Additional sex combs (Asx)              GI:3292939
Cramped                                 GI:5869804
Enhancer of Zeste (E(z))                GI:404864
Enhancer of polycomb                    GI:3757890
                                        GI:2133657 or GI:1050997
                                        GI:9989052
At Fnl9                                 GI:3152596
ZmEplOl                                 GI:20152912
Multi sex combs (mxc)                   GI:6746602
Pleiohomeotic (pho)                     GI:3258627
Polycomb (Pc)                           GI:129718
Polycomb-like (Pcl)                     GI:521181
Polyhomeotic distal (mouse)             GI:1490546
Polyhomeotic proximal (php)             GI:730323
Posterior sexcombs (Psc)                GI:548613 or GI:103177
Sexcomb extra (Sce)                     sequence unknown
Sex comb on midleg (Scm)                GI:1293574
Suppressor-2 of zeste                   GI:236137 (partial)
Suppressor of zeste 12 Su(z)12          GI:8131946
Su(z)2(D)                               sequence unknown
Super sex combs (sxc)                   sequence unknown
AtFis2                                  GI:4185501
AtEmf2                                  GI:14276050
AtVrn2                                  GI:16945788
At MEA; At CLF; At E(Z)-likeA1; Mez1;   GI:3089625
  Mez2; Mez3
At Fie                                  GI:4567095
Zm Fie1                                 GI:18032004
ZmFie2                                  GI:18032006

In some embodiments, a second polypeptide segment is the polypeptide encoded
by the Arabidopsis Mea, FIS2, FIE, At E(Z)-likeA1, curly-leaf, or TSO1-like genes or
homologs thereof. Polypeptides having modifications relative to these polypeptides also
can be suitable second polypeptide segments.

Also useful are proteins that are subunits of SIN3/HDAC complexes, including,
for example, Sin3, Rpd3, RbAp48, RbAp46, NcoR and SMRT. See e.g., Wolffe, A.P. et
al., Mol. Cell. Biol., 19:5847-5860 (1999). A partial list of nucleic acids encoding proteins
that are subunits of SIN3/HDAC complexes is shown in Table 2. Polypeptides having
modifications relative to these polypeptides also are suitable second polypeptide
segments.

Table 2. Subunits of Sin3/HDAC Complexes

Genes                                         GI number
--------------------------------------------  ------------------------
Sin3                                          GI:9624449
STB1 (Sin3 binding protein)                   GI:988311
STB2 (Sin3 binding protein)                   GI:988309
Rpd3                                          GI:417699
SDS3 (suppressor of defective silencing 3)    GI:1480732
HD2A                                          GI:7489751
HD2B                                          GI:7716948
HDAC1                                         GI:2498443
HDAC2                                         GI:3023939
RbAp48                                        GI:3309245
RbAp46                                        GI:4506439
SMRT                                          GI:2136312
Tup1                                          GI:83454
Ume6                                          GI:6320413
N-CoR1 and 2 (nuclear receptor corepressor)   GI:5454138; GI:12643957
Ssn6                                          GI:117936
Mad1                                          GI:1708908
Mnt                                           GI:6754718
Mxi1                                          GI:1709194
Rox                                           GI:3914034
PSF (polypyrimidine tract-binding             GI:10442545
  protein-associated splicing factor)
NonO/p54(nrb)                                 GI:13124797
Ikaros                                        GI:3915731
Aiolos                                        GI:2150044
MBD1                                          GI:7305259
MBD2                                          GI:5929756
MBD3                                          GI:4505119
MBD4                                          GI:6754652
MeCP1 (PCM1)                                  GI:7710141
MeCP2                                         GI:1708973
Mi-2                                          GI:4557451
SAP18                                         GI:11433775; 5032067
SAP30                                         GI:11436724; 4506783
MTA-like                                      GI:6754644
KRAB-ZFP (Kruppel associated box)             GI:9625008



Also useful are proteins that are subunits of Mad-Max complexes, another group
of CAP/HDAC complexes. Examples of Mad-Max complex subunits include Max-Mad-Mxi-Myc
(basic HLH), mSin3A/B, HDAC1/2, N-CoR (nuclear receptor corepressor), and
SMRT (silencing mediator of retinoic acid and thyroid hormone receptor). Also useful
are proteins that are subunits of Tup1-Ssn6 complexes. Examples of Tup1-Ssn6 complex
subunits include Ume6, Tup1, Ssn6, Mig1, α2 or Crt1, and HDAC class I complexes
(Rpd3, Hos1, Hos2). See e.g., Watson A.D. Genes & Dev., 14:2737-2744, (2000). Other
suitable subunits can include Sin4, Srb8, Srb10, Srb11, and Med3. In other embodiments,
polypeptides having modifications relative to the above polypeptides are suitable second
polypeptide segments.

Exemplary nucleotide and/or amino acid sequences of CAP/HDAC subunit genes 
and/or polypeptides are shown in Table 7. 

Arrangement of polypeptide segments

Segments of a chimeric HAT polypeptide are linked to one another by covalent
bonds, typically peptide bonds. The segments can be linked directly, without any
intervening amino acids between two segments. Alternatively, one segment can be linked
indirectly to an adjacent segment by amino acid residues that are situated between the two
adjacent segments and are themselves covalently linked to the adjacent segments. In
some embodiments, there are one, two, three, four, five, six, seven, eight, nine or ten
intervening amino acid residues. In other embodiments, there are fifteen, twenty, thirty,
forty or fifty intervening residues. In some embodiments, an intervening segment can be
a hinge domain. Typically, if there is an intervening segment, at least one of the amino
acids in the intervening segment is a glycine. At least one glycine is preferred in order to
promote structural flexibility of the spacer, and permit free rotation of the first
polypeptide segment relative to the second polypeptide segment. An illustrative
embodiment of an intervening segment is one having fifteen glycine residues positioned
between the first polypeptide segment and the second polypeptide segment and covalently
linked to each by a peptide bond.

An intervening peptide segment can be situated between the segments of a
chimeric polypeptide of the invention in order to facilitate interaction between the histone
in a nucleosome and the HAT of the chimeric polypeptide. Structural modeling can be
used to predict whether an intervening peptide segment is useful in a chimeric HAT
polypeptide. Structural modeling can be performed using software such as Rasmol 2.6,
available from the UC Berkeley website http://mc2.CChem.Berkeley.EDU/Rasmol/v2.6/.
For example, the theoretical distance between the first polypeptide segment of a chimeric
polypeptide and the surface of a nucleosome is modeled, based on the crystal structure of
a nucleosome (histones H2A, H2B, H3 and H4, and a 147 nucleotide DNA), the crystal
structure of the DNA binding domain of a TATA binding protein and the crystal structure
of a Tetrahymena histone acetyltransferase GCN5 homologue, including the coenzyme
acetyl-CoA and the 11-mer N-terminal tail of histone H3. The TATA binding protein is
modeled as it is situated on the DNA of the nucleosome. The HAT is modeled while
adjacent to the tail of histone H3. Next, the distance from the closest surface of HAT to
the nucleosome surface is calculated. Based on this example, an intervening peptide
segment of at least 28 Å in length facilitates interaction between the HAT and histone yet
maintains nucleosome interaction and histone modification. Twenty-eight Å is
approximately the same length as a peptide containing 15 amino acids. Structural
flexibility of the intervening peptide segment can be enhanced by using at least one
glycine amino acid and/or at least one alanine amino acid.
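
The relationship between the modeled distance and the linker length can be sketched as below. The only figure taken from the text is the equivalence of 28 Å to a peptide of about 15 amino acids; extending that ratio linearly to other distances is an assumption made for illustration, as are the function names.

```python
import math
from fractions import Fraction

# Ratio stated in the text: about 28 angstroms per 15 residues.
ANGSTROM_PER_RESIDUE = Fraction(28, 15)


def min_linker_residues(distance_angstrom):
    """Smallest residue count spanning the distance (linear assumption)."""
    return math.ceil(Fraction(distance_angstrom) / ANGSTROM_PER_RESIDUE)


def glycine_linker(n_residues):
    """All-glycine intervening segment, as in the illustrative embodiment."""
    return "G" * n_residues


n = min_linker_residues(28)
print(n, glycine_linker(n))
```

Exact fractions are used so that the 28 Å case yields exactly 15 residues rather than a value perturbed by floating-point error.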

The first polypeptide segment of a chimeric polypeptide can be the N-terminal
segment of a chimeric polypeptide of the invention. In such embodiments, the C-terminus
of the first polypeptide segment can be covalently linked to the N-terminus of
the second polypeptide segment, or can be covalently linked to the N-terminus of an
intervening peptide segment, which can be schematically indicated as 1st-2nd or 1st-i-2nd,
where "1st" indicates the first polypeptide segment, "2nd" indicates the second polypeptide
segment and "i" indicates an optional intervening peptide segment.

In other embodiments, the first polypeptide segment can be the C-terminal
segment of a chimeric polypeptide of the invention. In such embodiments, the C-terminus
of the second polypeptide segment is covalently linked to the N-terminus of the
first polypeptide segment, or can be covalently linked to the N-terminus of an intervening
peptide segment, which can be schematically indicated as 2nd-1st or 2nd-i-1st.
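
The two arrangements described above can be illustrated with a short sketch that concatenates one-letter amino acid strings in N-terminal to C-terminal order. The function name and the four-residue placeholder sequences are hypothetical and stand in for real first and second polypeptide segments.

```python
def assemble_chimera(first, second, linker="", first_is_n_terminal=True):
    """Join segments N- to C-terminus with an optional intervening segment."""
    if first_is_n_terminal:
        parts = [first, linker, second]   # first-i-second arrangement
    else:
        parts = [second, linker, first]   # second-i-first arrangement
    return "".join(parts)


first_segment = "MHAT"    # placeholder for the first polypeptide segment
second_segment = "MCAP"   # placeholder for the second polypeptide segment
linker = "G" * 15         # fifteen glycines, as in the illustrative embodiment

print(assemble_chimera(first_segment, second_segment, linker))
print(assemble_chimera(first_segment, second_segment, linker,
                       first_is_n_terminal=False))
```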

A chimeric polypeptide of the invention optionally can possess additional amino
acid residues at the amino-terminus or the carboxy-terminus. For example, 6x His-tag or
FLAG® residues can be linked to a polypeptide at the amino-terminus. See e.g., U.S.
Patent Nos. 4,851,341 and 5,001,912. As another example, a reporter polypeptide such as
green fluorescent protein (GFP) can be fused to the carboxy-terminus of the chimeric
polypeptide. See e.g., U.S. Patent No. 5,491,084.

With respect to polypeptides, "isolated" refers to a polypeptide that constitutes a
major component in a mixture of components, e.g., 30% or more, 40% or more, 50% or
more, 60% or more, 70% or more, 80% or more, 90% or more, or 95% or more by
weight. Isolated polypeptides typically are obtained by purification from an organism
that makes the polypeptide, although chemical synthesis is also feasible. Methods of
polypeptide purification include, for example, chromatography or immunoaffinity
techniques.

The amino acid sequence of either or both polypeptide segments of a chimeric
HAT polypeptide can be a non-naturally occurring amino acid sequence. For example,
the amino acid sequence of one polypeptide segment can be a naturally occurring
sequence found in a particular species, while the amino acid sequence of the other
polypeptide segment is a non-naturally occurring consensus amino acid sequence based
on the naturally occurring sequences of homologs from different species.

A polypeptide of the invention can be detected by sodium dodecyl sulphate
(SDS)-polyacrylamide gel electrophoresis followed by Coomassie Blue-staining or
Western blot analysis using monoclonal or polyclonal antibodies that have binding
affinity for the polypeptide to be detected.

Nucleic Acids Encoding a Chimeric Polypeptide 

The present invention also includes nucleic acids encoding the above-described
chimeric polypeptides. As used herein, nucleic acid refers to RNA or DNA, including
cDNA, synthetic DNA or genomic DNA. The nucleic acids can be single- or double-stranded,
and if single-stranded, can be either the coding or non-coding strand. As used
herein with respect to nucleic acids, "isolated" refers to (i) a naturally-occurring nucleic
acid encoding part or all of a polypeptide of the invention, but free of sequences, i.e.,
coding sequences, that normally flank one or both sides of the nucleic acid encoding
polypeptide in a genome; (ii) a nucleic acid incorporated into a vector or into the genomic
DNA of an organism such that the resulting molecule is not identical to any naturally-occurring
vector or genomic DNA; or (iii) a cDNA, a genomic nucleic acid fragment, a
fragment produced by polymerase chain reaction (PCR) or a restriction fragment.
Specifically excluded from this definition are nucleic acids present in mixtures of nucleic
acid molecules or cells.

It should be appreciated that nucleic acids having a nucleotide sequence other than
the specific nucleotide sequences disclosed herein can still encode a polypeptide having
the exemplified amino acid sequence. The degeneracy of the genetic code is well known
to those of ordinary skill in the art; i.e., for many amino acids, there is more than one
nucleotide triplet that serves as the codon for the amino acid.
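
The degeneracy point can be illustrated with a short sketch in which two different nucleotide sequences translate to the same amino acid sequence. Only a small excerpt of the standard genetic code is included, and the two DNA sequences are hypothetical examples, not sequences from this document.

```python
# Excerpt of the standard genetic code (DNA codons, coding strand).
CODONS = {
    "ATG": "M",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "AAA": "K", "AAG": "K",
}


def translate(dna):
    """Translate a coding sequence using the codon excerpt above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


seq_a = "ATGGGTGCTAAA"  # hypothetical coding sequence
seq_b = "ATGGGCGCGAAG"  # differs at three positions, same encoded peptide

print(translate(seq_a), translate(seq_b))
```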

Nucleic acid constructs 

Further provided are nucleic acid constructs comprising the above-described
nucleic acid coding sequences. Such constructs can comprise a cloning vector. Cloning
vectors suitable for use in the present invention are commercially available and are used
routinely by those of ordinary skill in the art.

Nucleic acid constructs also can contain sequences encoding other polypeptides.
Such polypeptides can, for example, facilitate the introduction or maintenance of the
nucleic acid construct into a host organism. Other polypeptides also can affect the
expression, activity, or biochemical or physiological effect of the encoded CBF
polypeptide. Alternatively, other polypeptide coding sequences can be provided on
separate nucleic acid constructs.

Nucleic acid constructs of the invention can comprise one or more regulatory
elements operably linked to a nucleic acid coding sequence. Such regulatory elements
can include promoter sequences, enhancer sequences, response elements or inducible
elements that modulate expression of a nucleic acid sequence. As used herein, "operably
linked" refers to positioning of a regulatory element in a construct relative to a nucleic
acid coding sequence in such a way as to permit or facilitate expression of the encoded
polypeptide. The choice of element(s) that can be included depends upon several factors,
including, but not limited to, replication efficiency, selectability, inducibility, desired
expression level, and cell or tissue specificity.

Suitable regulatory elements include promoters that initiate transcription only, or
predominantly, in certain cell types. For example, promoters specific to vegetative
tissues such as ground meristem, vascular bundle, cambium, phloem, cortex, shoot apical
meristem, lateral shoot meristem, root apical meristem, lateral root meristem, leaf
primordium, leaf mesophyll, or leaf epidermis can be suitable regulatory elements. In
other embodiments, a promoter specific to a reproductive tissue (e.g., fruit, ovule, seed,
pollen, pistils, female gametophyte, egg cell, central cell, nucellus, suspensor, synergid
cell, flowers, embryonic tissue, embryo, zygote, endosperm, integument, seed coat or
pollen) is used. A cell type or tissue-specific promoter can drive expression of operably
linked sequences in tissues other than the target tissue. Thus, as used herein a cell type or
tissue-specific promoter is one that drives expression preferentially in the target tissue,
but can also lead to some expression in other cell types or tissues as well. Methods for
identifying and characterizing promoter regions in plant genomic DNA include, for
example, those described in the following references: Jordano, et al., Plant Cell, 1:855-866
(1989); Bustos, et al., Plant Cell, 1:839-854 (1989); Green, et al., EMBO J., 7:4035-4044
(1988); Meier, et al., Plant Cell, 3:309-316 (1991); and Zhang, et al., Plant Physiol.,
110:1069-1079 (1996).

Exemplary reproductive tissue promoters include those derived from the
following seed-genes: zygote and embryo LEC1 (see, Lotan (1998) Cell 93:1195-1205);
suspensor G564 (see, Weterings (2001) Plant Cell 13:2409-2425); maize MAC1 (see,
Sheridan (1996) Genetics, 142:1009-1020); maize Cat3 (see, GenBank No. L05934;
Abler (1993) Plant Mol. Biol., 22:1031-1038); Arabidopsis viviparous-1 (see, GenBank
No. U93215); Arabidopsis atmyc1 (see, Urao (1996) Plant Mol. Biol., 32:571-57;
Conceicao (1994) Plant, 5:493-505); and Brassica napus napin gene family, including
napA (see, GenBank No. J02798; Josefeson (1987) JBL 26:12196-1301; Sjodahl (1995)
Planta, 197:264-271). Other exemplary reproductive tissue-specific promoters include
those derived from the pollen genes described in, for example: Guerrero (1990) Mol. Gen.
Genet., 224:161-168; Wakeley (1998) Plant Mol. Biol., 37:187-192; Ficker (1998) Mol.
Gen. Genet., 257:132-142; Kulikauskas (1997) Plant Mol. Biol., 34:809-814; and Treacy
(1997) Plant Mol. Biol., 34:603-611. Yet other suitable reproductive tissue promoters
include those derived from the following embryo genes: Brassica napus 2s storage
protein (see, Dasgupta (1993) Gene, 133:301-302); Arabidopsis 2s storage protein (see,
GenBank No. AL161566); soybean β-conglycinin (see, GenBank No. S44893); Brassica
napus oleosin 20kD gene (see, GenBank No. M63985); soybean oleosin A (see, GenBank
No. U09118); soybean oleosin B (see, GenBank No. U09119); soybean lectin1 (see,
GenBank K00821); soybean Kunitz trypsin inhibitor 3 (see, GenBank No. AF233296);
soybean glycinin1 (see, GenBank No. X15121); Arabidopsis oleosin (see, GenBank No.
Z17657); maize oleosin 18kD (see, GenBank No. J05212; Lee (1994) Plant Mol. Biol.
26:1981-1987); and the gene encoding low molecular weight sulfur rich protein from
soybean (see, Choi (1995) Mol. Gen. Genet., 246:266-268). Yet other exemplary
reproductive tissue promoters include those derived from the following endosperm genes:
Arabidopsis Fie (see, GenBank No. AF129516); Arabidopsis Mea; Arabidopsis Fis2 (see,
GenBank No. AF096096); rice Glu1 (see, GenBank No. M28156); and rice 26 kDa
globulin (see, GenBank No. D50643). Yet other exemplary reproductive tissue promoters
include those derived from the following genes: ovule BEL1 (see, Reiser (1995) Cell,
83:735-742; Ray (1994) Proc. Natl. Acad. Sci. USA, 91:5761-5765; GenBank No.
U39944); central cell FIE (see, GenBank No. AF129516); flower primordia Arabidopsis
APETALA1 (a.k.a. AP1) (see, Gustafson-Brown (1994) Cell, 76:131-143; Mandel
(1992) Nature, 360:273-277); flower Arabidopsis AP2 (see, Jofuku (1994) Plant Cell
6:1211-1225); Arabidopsis flower ufo, expressed at the junction between sepal and petal
primordia (see, Bossinger (1996) Development, 122:1093-1102); fruit-specific tomato E8;
a tomato gene expressed during fruit ripening, senescence and abscission of leaves and
flowers (see, Blume (1997) Plant J., 12:731-746); pistil-specific potato SBC2 (see, Ficker
(1997) Plant Mol. Biol., 35:425-431); Arabidopsis DMC1 (see, GenBank No. U76670);
and Arabidopsis DMT1 (see, Choi (2002) Cell, 109).

Suitable vegetative tissue promoters include those derived from the following
genes: pea Blec4, active in epidermal tissue of vegetative and floral shoot apices of
transgenic alfalfa; potato storage protein patatin gene (see, Kim (1994) Plant Mol. Biol.,
26:603-615; Martin (1997) Plant J., 11:53-62); root Agrobacterium rhizogenes ORF13
(see, Hansen (1997) Mol. Gen. Genet., 254:337-343); genes active during taro corm
development (see, Bezerra (1995) Plant Mol. Biol., 28:137-144; de Castro (1992) Plant
Cell, 4:1549-1559); root meristem and immature central cylinder tobacco gene TobRB7
(see, Yamamoto (1991) Plant Cell, 3:371-382); ribulose biphosphate carboxylase genes
RBCS1, RBCS2, and RBCS3A expressed in tomato leaves (see, Meier (1997) FEBS
Lett., 415:91-95); ribulose biphosphate carboxylase genes expressed in leaf blade and leaf
sheath mesophyll cells (see, Matsuoka (1994) Plant J., 6:311-319); leaf chlorophyll a/b
binding protein (see e.g., Shiina (1997) Plant Physiol., 115:477-483; Casal (1998) Plant
Physiol., 116:1533-1538); Arabidopsis Atmyb5, expressed in developing leaf trichomes,
stipules, in epidermal cells on the margins of young rosette and cauline leaves, and in
immature seeds between fertilization and the 16 cell stage of embryo development and
persists beyond the heart stage (see, Li (1996) FEBS Lett., 379:117-121); a maize
leaf-specific gene described by Busk (1997) Plant J., 11:1285-1295;
"SHOOTMERISTEMLESS" and "SCARECROW" genes active in developing shoot or
root apical meristems (see e.g., Di Laurenzio (1996) Cell, 86:423-433; Long (1996)
Nature, 379:66-69); 3-hydroxy-3-methylglutaryl coenzyme A reductase HMG2,
expressed in meristematic and floral (e.g., secretory zone of the stigma, mature pollen
grains, gynoecium vascular tissue, and fertilized ovules) tissues (see, Enjuto (1995)
Plant Cell, 7:517-527); meristem kn1-related genes from maize and other species (see,
Granger (1996) Plant Mol. Biol., 31:373-378; Kerstetter (1994) Plant Cell, 6:1877-1887;
Hake (1995) Philos. Trans. R. Soc. Lond. B. Biol. Sci., 350:45-51; Lincoln (1994) Plant
Cell, 6:1859-1876); and constitutive Cauliflower mosaic virus 35S.

Cell type or tissue-specific promoters derived from viruses also can be suitable
regulatory elements. Exemplary viral promoters include: the tobamovirus subgenomic
promoter (Kumagai (1995) Proc. Natl. Acad. Sci. USA, 92:1679-1683); the phloem-specific
rice tungro bacilliform virus (RTBV) promoter; and the cassava vein mosaic virus
(CVMV) promoter, expressed most strongly in vascular elements, leaf mesophyll cells,
and root tips (Verdaguer (1996) Plant Mol. Biol., 31:1129-1139).

In some embodiments, a nucleic acid construct of the invention contains a
promoter and a recognition site for a transcriptional activator, both of which are operably
linked to the coding sequence for a chimeric polypeptide. In these embodiments,
transgenic organisms or mixtures of cells that express the chimeric polypeptide contain a
second nucleic acid construct that encodes a transcriptional activator. A transcriptional
activator is a polypeptide that binds to a recognition site on DNA, resulting in an increase
in the level of transcription from a promoter associated in cis with the recognition site.

The recognition site for the transcriptional activator polypeptide is positioned with
respect to the promoter so that upon binding of the transcriptional activator to the
recognition site, the level of transcription from the promoter is increased. The position of
the recognition site relative to the promoter can be varied for different transcriptional
activators, in order to achieve the desired increase in the level of transcription.

Many transcriptional activators have discrete DNA binding and transcription
activation domains. The DNA binding domain(s) and transcription activation domain(s)
of transcriptional activators can be synthetic or can be derived from different sources
(e.g., two-component system or chimeric transcriptional activators). In some
embodiments, a two-component system transcriptional activator has a DNA binding
domain derived from the yeast gal4 gene and a transcription activation domain derived
from the VP16 gene of herpes simplex virus. In other embodiments, a two-component
system transcriptional activator has a DNA binding domain derived from a yeast HAP1
gene and the transcription activation domain derived from VP16. Populations of
transgenic organisms or cells having a first nucleic acid construct that encodes a chimeric
polypeptide and a second nucleic acid construct that encodes a transcriptional activator
polypeptide can be produced by transformation, transfection, or genetic crossing. See
e.g., WO 97/31064.

A nucleic acid encoding a novel polypeptide of the invention can be obtained by,
for example, DNA synthesis or the polymerase chain reaction (PCR). PCR refers to a
procedure or technique in which target nucleic acids are amplified. PCR can be used to
amplify specific sequences from DNA as well as RNA, including sequences from total
genomic DNA or total cellular RNA. Various PCR methods are described, for example,
in PCR Primer: A Laboratory Manual, Dieffenbach, C. & Dveksler, G., Eds., Cold
Spring Harbor Laboratory Press, 1995. Generally, sequence information from the ends of
the region of interest or beyond is employed to design oligonucleotide primers that are
identical or similar in sequence to opposite strands of the template to be amplified.
Various PCR strategies are available by which site-specific nucleotide sequence
modifications can be introduced into a template nucleic acid.

Nucleic acids of the present invention can be detected by methods such as
ethidium bromide staining of agarose gels, Southern or Northern blot hybridization, PCR
or in situ hybridizations. Hybridization typically involves Southern or Northern blotting
(see, for example, sections 9.37-9.52 of Sambrook et al., 1989, "Molecular Cloning, A
Laboratory Manual", 2nd Edition, Cold Spring Harbor Press, Plainview, NY). Probes
should hybridize under high stringency conditions to a nucleic acid or the complement
thereof. High stringency conditions can include the use of low ionic strength and high
temperature washes, for example 0.015 M NaCl/0.0015 M sodium citrate (0.1X SSC),
0.1% sodium dodecyl sulfate (SDS) at 65°C. In addition, denaturing agents, such as
formamide, can be employed during high stringency hybridization, e.g., 50% formamide
with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium
phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C.
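
The salt concentrations quoted above are multiples of one SSC stock. The sketch below infers 1X SSC from the stated 0.1X wash (0.015 M NaCl/0.0015 M sodium citrate) and scales it; on that basis the 750 mM NaCl/75 mM sodium citrate hybridization buffer corresponds to 5X SSC. The function and variable names are hypothetical, and decimal arithmetic is used only to keep the printed values exact.

```python
from decimal import Decimal

# 1X SSC as implied by the text: 0.1X = 0.015 M NaCl / 0.0015 M sodium citrate.
ONE_X_SSC = {"NaCl": Decimal("0.15"), "sodium citrate": Decimal("0.015")}


def ssc_molarity(fold):
    """Molar concentration of each salt at the given fold of SSC."""
    factor = Decimal(str(fold))
    return {salt: factor * molar for salt, molar in ONE_X_SSC.items()}


print(ssc_molarity("0.1"))  # high stringency wash
print(ssc_molarity(5))      # salt content of the formamide hybridization buffer
```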

Eukaryotic Organisms 

The term "host" or "host cell" includes not only prokaryotes, such as E. coli, but
also eukaryotes, such as fungal, insect, plant and animal cells. Animal cells include, for
example, COS cells and HeLa cells. Fungal cells include yeast cells, such as
Saccharomyces cerevisiae cells. A host cell can be transformed or transfected with a
DNA molecule (e.g., a vector) using techniques known to those of ordinary skill in this
art, such as calcium phosphate or lithium acetate precipitation, electroporation,
lipofection and particle bombardment. Host cells containing a vector of the present
invention can be used for such purposes as propagating the vector, producing a nucleic
acid (e.g., DNA, RNA, antisense RNA) or expressing a polypeptide or fragments thereof.

Plants 

Among the eukaryotic organisms featured in the invention are plants containing an 
exogenous nucleic acid that encodes a polypeptide of the invention, e.g., nucleic acids 
encoding a polypeptide having an amino acid sequence as shown in Table 9 or in Table 
11. 

Accordingly, a method according to the invention comprises introducing a nucleic
acid construct as described herein into a plant. Techniques for introducing exogenous
nucleic acids into monocotyledonous and dicotyledonous plants are known in the art, and
include, without limitation, Agrobacterium-mediated transformation, viral vector-mediated
transformation, electroporation and particle gun transformation, e.g., U.S.
Patents 5,204,253 and 6,013,863. If a cell or tissue culture is used as the recipient tissue
for transformation, plants can be regenerated from transformed cultures by techniques
known to those skilled in the art. Transgenic plants can be entered into a breeding
program, e.g., to introduce a nucleic acid encoding a polypeptide into other lines, to
transfer the nucleic acid to other species or for further selection of other desirable traits.
Alternatively, transgenic plants can be propagated vegetatively for those species amenable
to such techniques. Progeny includes descendants of a particular plant or plant line.
Progeny of an instant plant include seeds formed on F1, F2, F3, and subsequent generation
plants, or seeds formed on BC1, BC2, BC3, and subsequent generation plants. Seeds
produced by a transgenic plant can be grown and then selfed (or outcrossed and selfed) to
obtain seeds homozygous for the nucleic acid encoding a novel polypeptide.

A suitable group of plants with which to practice the invention includes dicots,
such as safflower, alfalfa, soybean, rapeseed (high erucic acid and canola), or sunflower.
Also suitable are monocots such as corn, wheat, rye, barley, oat, rice, millet, amaranth or
sorghum. Also suitable are vegetable crops or root crops such as potato, broccoli, peas,
sweet corn, popcorn, tomato, beans (including kidney beans, lima beans, dry beans, green
beans) and the like. Thus, the invention has use over a broad range of plants, including
species from the genera Anacardium, Arachis, Asparagus, Atropa, Avena, Brassica,
Citrus, Citrullus, Capsicum, Carthamus, Cocos, Coffea, Cucumis, Cucurbita, Daucus,
Elaeis, Fragaria, Glycine, Gossypium, Helianthus, Heterocallis, Hordeum, Hyoscyamus,
Lactuca, Linum, Lolium, Lupinus, Lycopersicon, Malus, Manihot, Majorana, Medicago,
Nicotiana, Olea, Oryza, Panicum, Pannesetum, Persea, Phaseolus, Pistachia, Pisum,
Pyrus, Prunus, Raphanus, Ricinus, Secale, Senecio, Sinapis, Solanum, Sorghum,
Theobromus, Trigonella, Triticum, Vicia, Vitis, Vigna and Zea.

Chimeric polypeptides of the invention can be expressed in plants in a cell- or
tissue-specific manner according to the regulatory elements chosen to include in a
particular nucleic acid construct present in the plant. Suitable cells, tissues and organs in
which to express a chimeric polypeptide of the invention include, without limitation, egg
cell, central cell, synergid cell, zygote, ovule primordia, nucellus, integuments,
endothelium, female gametophyte cells, embryo, axis, cotyledons, suspensor, endosperm,
seed coat, ground meristem, vascular bundle, cambium, phloem, cortex, shoot or root
apical meristems, lateral shoot or root meristems, floral meristem, leaf primordia, leaf
mesophyll cells, and leaf epidermal cells, e.g., epidermal cells involved in forming the
cuticular layer.

Fungi

Other eukaryotic organisms featured in the invention are fungi containing an
exogenous nucleic acid that encodes a chimeric polypeptide of the invention, e.g., nucleic
acids encoding a polypeptide having the amino acid sequence as shown in Table 9 or in
Table 11.

Accordingly, a method according to the invention comprises introducing a nucleic
acid construct as described herein into a fungus. Techniques for introducing exogenous
nucleic acids into many fungi are known in the art, e.g., U.S. Patents 5,252,726 and
5,070,020. Transformed fungi can be cultured by techniques known to those skilled in
the art. Such fungi can be used to introduce a nucleic acid encoding a polypeptide into
other fungal strains, to transfer the nucleic acid to other species or for further selection of
other desirable traits.

A suitable group of fungi with which to practice the invention includes fission
yeast and budding yeast, such as Saccharomyces cerevisiae, Schizosaccharomyces pombe,
Saccharomyces carlsbergensis and Candida albicans. Filamentous fungi such as
Aspergillus spp. and Penicillium spp. also are useful.

Animals 

Other eukaryotic organisms featured in the invention are animals (e.g., insects
such as mosquitoes and flies; fish; and non-human mammals such as rodents, bovines and
porcines) that contain an exogenous nucleic acid that encodes a chimeric polypeptide of
the invention, e.g., nucleic acids encoding a polypeptide having the amino acid sequence
as shown in Table 9 or in Table 11. A variety of techniques known in the art can be used
to generate such transgenic animals. Such techniques typically involve generating a
plurality of animals whose genomes can be screened for the presence or absence of the
transgene. For example, a transgene can be introduced into a non-human mammal by
pronuclear microinjection (U.S. Patent No. 4,873,191), retrovirus mediated gene transfer
into germ lines (Van der Putten et al., Proc. Natl. Acad. Sci. USA, 82:6148, 1985), gene
targeting into embryonic stem cells (Thompson et al., Cell 56:313, 1989), electroporation
of embryos (Lo, Mol. Cell Biol., 3:1803, 1983), and transformation of somatic cells in
vitro followed by nuclear transplantation (Wilmut et al., Nature, 385(6619):810-813,
1997; and Wakayama et al., Nature, 394:369-374, 1998). When using mice to make a
transgenic animal, suitable genetic backgrounds for use in making founder lines include,
without limitation, C57B6, SJL/J, FVB/N, 129SV, BALB/C, C3H, and hybrids thereof.

Methods of gene profiling

In another aspect, the invention provides a method in which gene function can be
determined from changes in an organism's gene expression profile. The method involves
expressing a chimeric polypeptide in a specific cell type, tissue or organ in an organism or
population of organisms. The organism can be, for example, an animal, plant, or fungus.
The term "specific cell type" refers to cells that have one or more characteristics that
distinguish them from the other cells in an organism, or from other cells in a mixture of
cells. Distinguishing features can include, for example, physical location, cell division
rate, developmental stage, differentiation status, macromolecular composition, gene
expression profile, protein expression profile, particular cell type, or presence or absence
of a particular polypeptide. Specific cell types can be found in an organ, tissue, or tissue
or cell culture, e.g., egg cells from embryo sacs, scutellar cells of a mature kernel, cells
containing seed storage proteins from cotyledons and rapidly dividing fibroblasts from
skin. Specific cell types also can be found in more than one organ, tissue, or tissue or cell
culture, e.g., meristematic cells from plant shoot and root apices, and mucosal cells from
the large intestine and the nasal cavity.

The method typically involves introducing an exogenous nucleic acid encoding 
the chimeric polypeptide into an organism. In some embodiments, the exogenous nucleic 
acid contains a regulatory element that directs expression of the chimeric polypeptide in 
specific cell types. In other embodiments, the exogenous nucleic acid is situated in the
genome of the target organism such that expression of the chimeric polypeptide is
governed by native transcriptional regulatory elements (e.g., a native cell type-specific 
promoter). 

In yet other embodiments, the nucleic acid construct encoding a chimeric
polypeptide contains a recognition site for a transcriptional activator. In these
embodiments, transgenic organisms or mixtures of cells that express the chimeric
polypeptide contain a second nucleic acid construct that encodes the transcriptional
activator, and one or more regulatory elements that facilitate expression of the
transcription activator in a specific cell type. Thus, in these embodiments, the exogenous
transcription activator is expressed in specific cells, and in turn activates transcription of
the chimeric polypeptide in those cells. Populations of transgenic organisms or cells
having a first nucleic acid construct that encodes a chimeric polypeptide and a second
nucleic acid construct that encodes a transcriptional activator can be produced by
transformation, transfection, or genetic crossing.

Cell type-specific expression of a chimeric polypeptide can alter an organism's
gene expression profile (i.e., the cell types in which particular sets of genes are
transcribed, and the level at which such genes are transcribed) relative to organisms that
do not express the chimeric polypeptide. Alterations in gene expression profile can be
manifested in changes in the macromolecular (e.g., RNA, protein, chemical) composition
of organisms that express a chimeric polypeptide in a cell-specific manner. The skilled
artisan can measure the RNA or protein composition of specific cells using routine
techniques such as, for example, thin layer or gas-liquid chromatography, gel
electrophoresis of protein extracted from appropriate cells, and gel electrophoresis of
RNA extracted from appropriate cells. The skilled artisan can measure the expression of
particular genes or proteins using the above-mentioned methods alone or in combination
with, for example, protein immunochemistry or nucleic acid hybridization assays using
electrophoretically or chromatographically separated macromolecules, microarray
analysis, or specific RT-PCR. The above-described techniques can provide quantitative,
semi-quantitative or qualitative detection of gene expression. Alterations in gene
expression profile can be detected by comparing the gene expression profiles of, for
example, a transgenic organism that expresses the chimeric polypeptide in specific cells
and an organism that lacks the nucleic acid construct or does not express the chimeric
polypeptide.
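
The comparison of expression profiles described above can be illustrated with a minimal sketch. It assumes that each profile has already been reduced to one quantitative signal per gene (for example, a microarray intensity); the gene names, the signal values, the pseudocount and the two-fold threshold are hypothetical choices made for illustration and are not taken from this description.

```python
import math


def log2_fold_changes(transgenic, control, pseudocount=1.0):
    """Return {gene: log2 ratio of transgenic to control signal}.

    Only genes measured in both profiles are compared. The pseudocount
    (an illustrative assumption) avoids division by zero for silent genes.
    """
    shared = transgenic.keys() & control.keys()
    return {
        gene: math.log2(
            (transgenic[gene] + pseudocount) / (control[gene] + pseudocount)
        )
        for gene in shared
    }


def altered_genes(transgenic, control, threshold=1.0, pseudocount=1.0):
    """Keep genes whose absolute log2 fold change meets the threshold."""
    changes = log2_fold_changes(transgenic, control, pseudocount)
    return {gene: lfc for gene, lfc in changes.items() if abs(lfc) >= threshold}


# Hypothetical signals for an organism expressing the chimeric polypeptide
# in specific cells and for a control organism lacking the construct.
transgenic_profile = {"geneA": 31.0, "geneB": 7.0, "geneC": 3.0}
control_profile = {"geneA": 7.0, "geneB": 7.0, "geneC": 15.0}

result = altered_genes(transgenic_profile, control_profile)
```

In this sketch, a positive value marks a gene that is more strongly expressed in the transgenic organism and a negative value marks a gene that is less strongly expressed; unchanged genes are dropped.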

Once the transcriptional and/or translational activity of a set of genes has been
determined in a specific cell type and/or at a desired time, the function of the set of genes
can be assigned to particular developmental, physiological and/or biochemical pathways.
In addition, a microarray containing the set of genes, or a subset thereof, can be made.
See e.g., U.S. patents 5,424,186 and 6,156,501. The microarray can contain a plurality of
oligonucleotides, each oligonucleotide representing a portion of the sequence of one gene
from the set of genes. Each of the oligonucleotides is coupled to a solid substrate at a
known location. The substrate can be silica, polymeric materials, glass, beads, slides or
chips. Such microarrays can be used, for example, to determine the level of transcription
of the set of genes in other cell types and thereby identify genes whose transcription is
repressed solely in the specific cell type. Such genes are suitable targets for further
manipulation. For example, genes that are inactivated solely during fruit maturation can
be targeted for a modification that results in continued expression of such genes for an
additional period of time, in order to delay fruit ripening and/or increase fruit size.
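
As an illustration of how microarray results across cell types might be screened for genes repressed solely in the specific cell type, the following sketch assumes a table of transcription levels per gene and per cell type. The cell type names, the gene names, the levels and the detection threshold are all hypothetical and serve only to show the selection logic.

```python
def repressed_solely_in(target_cell_type, expression, detection_threshold=1.0):
    """Return genes silent in the target cell type but detected in all others.

    expression maps gene -> {cell_type: transcription level}. A gene counts
    as repressed when its level is below detection_threshold (an assumed,
    illustrative cutoff).
    """
    selected = []
    for gene, levels in expression.items():
        if target_cell_type not in levels:
            continue  # gene was not measured in the cell type of interest
        others = [lvl for ct, lvl in levels.items() if ct != target_cell_type]
        silent_in_target = levels[target_cell_type] < detection_threshold
        active_elsewhere = bool(others) and all(
            lvl >= detection_threshold for lvl in others
        )
        if silent_in_target and active_elsewhere:
            selected.append(gene)
    return sorted(selected)


# Hypothetical transcription levels measured with one microarray design.
levels_by_gene = {
    "gene1": {"egg cell": 0.2, "leaf mesophyll": 8.0, "root cortex": 5.5},
    "gene2": {"egg cell": 0.1, "leaf mesophyll": 0.3, "root cortex": 6.0},
    "gene3": {"egg cell": 4.0, "leaf mesophyll": 9.0, "root cortex": 7.0},
}

candidates = repressed_solely_in("egg cell", levels_by_gene)
```

Here gene2 is excluded because it is also silent in a second cell type, and gene3 is excluded because it is transcribed in the cell type of interest.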

Methods for modulating gene expression 

In another aspect, the invention provides methods for modulating gene expression
in an organism. Modulating gene expression involves expressing a chimeric polypeptide
in specific cells in an organism or population of organisms. The organism can be, for
example, yeast or a plant.

An exogenous nucleic acid encoding a chimeric polypeptide is introduced into an
organism. In some embodiments, the exogenous nucleic acid contains a regulatory
element that directs expression of the chimeric polypeptide in specific cells or tissues. In
other embodiments, the exogenous nucleic acid is situated in the genome of the target
organism such that expression of the chimeric polypeptide is governed by native
transcriptional regulatory elements (e.g., a native cell type or tissue-specific promoter).

In yet other embodiments, the nucleic acid construct that encodes a chimeric
polypeptide contains a recognition site for a transcriptional activator. In these
embodiments, transgenic organisms or mixtures of cells that express the chimeric
polypeptide contain a second nucleic acid construct that encodes a transcriptional
activator. The second nucleic acid construct contains a regulatory element that directs
expression of the transcription activator in specific cells. Thus, in these embodiments, the
exogenous transcription activator is expressed in specific cells or tissues, and in turn
activates transcription of the chimeric polypeptide in those cells. Populations of
transgenic organisms or cells having a nucleic acid construct that encodes a chimeric
polypeptide and a nucleic acid construct that encodes a transcriptional activator
polypeptide can be produced by transformation, transfection, or genetic crossing.

By expressing a chimeric polypeptide in specific cells, it is possible to modulate
gene expression in an organism (e.g., by derepressing genes that normally are
transcriptionally inactive). An organism or cell exhibiting modulated gene expression can
have compositional (e.g., protein, nucleic acid, lipid, saccharide), developmental and
phenotypic alterations relative to organisms or cells that do not express the chimeric
polypeptide. For example, modulated gene expression in plants can alter seed
development, seed yield, seed composition, endosperm development, embryo
development, cotyledon development, seed size, flowering time, plant size, leaf size, leaf
shape, plant fertility, apical dominance, floral organ identity, root development, or organ
composition. In plants, cell type-specific expression of chimeric polypeptides also can
cause fertilization-independent endosperm development and fertilization-independent seed
development.

In some embodiments, seed development can be altered by expressing a chimeric
polypeptide in the developing ovule or seed of a plant. In such embodiments, the
chimeric polypeptide can modulate endosperm and/or embryo development; developing
seed in such plants can exhibit altered endosperm and/or altered embryo development;
and plants can exhibit altered seed yield (by number and/or mass). The effects of
expressing a chimeric polypeptide on seed development can be enhanced when DNA
methylation is reduced. DNA methylation can be reduced, e.g., by mutation of or
antisense nucleic acid interference with a gene encoding a DNA methyltransferase.
Exemplary plant DNA methyltransferase genes include Met1, Cmt3, Zmet2, Drm1, Drm2
(Vielle-Calzada et al. (1999) Genes & Dev. 13:2971-2982; Richards et al. (2000) US
Patent 6,153,741; Dellaporta and Chen (2000) US Patent 6,011,200; Vinkenoog et al.
(2000) The Plant Cell 12:2271-2282; Luo et al. (2000) Proc. Natl. Acad. Sci. USA
97:10637-10642; Jackson et al. (2002) Nature 416:556-560). DNA methylation also can
be reduced by mutation of or antisense nucleic acid interference with certain genes that
encode chromatin associated proteins that have a role in DNA methylation. Such genes
include Ddm1 (see Jeddeloh et al. (1999) Nature Genetics 22:94-97) and Kyp (see
Jackson et al. (2002) Nature 416:556-560). In these embodiments, plants can have
altered seed yield by mass. Mutations of or antisense nucleic acid interference with other
genes, such as Mom (see Amedeo et al. (2000) Nature 405:203-206), that have a
post-DNA methylation role in DNA methylation state also can enhance the effects of
expressing a chimeric polypeptide on seed development.

In some embodiments, the exogenous nucleic acid contains a regulatory element 
that directs expression of the chimeric polypeptide to specific cells or tissues. 

In yet other embodiments, the nucleic acid construct that encodes a chimeric
polypeptide contains a recognition site for a transcriptional activator. In these
embodiments, transgenic organisms or mixtures of cells that express the chimeric
polypeptide contain a second nucleic acid construct that encodes a transcriptional
activator. The second nucleic acid construct contains a regulatory element that directs
expression of the transcription activator in specific cells. Thus, in these embodiments, the
exogenous transcription activator is expressed in specific cells or tissues, and in turn
activates transcription of the chimeric polypeptide in those cells. Populations of
transgenic organisms or cells having a nucleic acid construct that encodes a chimeric
polypeptide and a nucleic acid construct that encodes a transcriptional activator
polypeptide can be produced by transformation, transfection, or genetic crossing.

Methods of making sterile plants

In another aspect, the invention provides methods for making sterile plants by
introducing an exogenous nucleic acid encoding a chimeric polypeptide. In some
embodiments, the exogenous nucleic acid contains a regulatory element that directs
expression of the chimeric polypeptide in reproductive cells. In other embodiments, the
exogenous nucleic acid is situated in the genome of the target organism such that expression
of the chimeric polypeptide is governed by a native transcriptional regulatory element that
facilitates transcription in reproductive cells.

In yet other embodiments, the nucleic acid construct that encodes a chimeric 
polypeptide contains a recognition site for a transcriptional activator. In these 
embodiments, transgenic plants that express the chimeric polypeptide contain a second 
nucleic acid construct that encodes a transcriptional activator and one or more regulatory 
elements that facilitate expression of the transcription activator in plant reproductive cells. 
Thus, in these embodiments, the transcription activator is expressed in plant reproductive 
cells, which in turn activates transcription of the chimeric polypeptide in reproductive 
cells. Transformation and/or genetic crosses, for example, can produce plants that 
contain a nucleic acid construct that encodes a chimeric polypeptide and a nucleic acid 
construct that encodes a transcriptional activator polypeptide. Expressing a chimeric
polypeptide in plant reproductive cells can affect the reproductive and/or
developmental processes and prevent the production of viable embryos from female
reproductive tissues.

The invention is further described in the following examples, which do not limit 
the scope of the invention described in the claims. 

EXAMPLES 

Example 1: Polypeptides having histone acetyltransferase activity.

Polypeptides are tested for histone acetyltransferase activity using assays
previously described (see Brownell, J. and Allis, C. Proc. Natl. Acad. Sci. USA,
92:6364-6368 (1995); Brownell, J. E. et al. Cell, 84:843-851 (1996)). Coding
sequences of candidate polypeptides are cloned into an appropriate expression
vector, the expression vector is introduced into a bacterial host strain, expression of
the gene is induced and protein extract is prepared. The extracts are incubated with
calf thymus histones and [3H]-acetyl-Coenzyme A. Radioactivity transferred to the
histone substrate in an extract-dependent manner is quantified by liquid scintillation
counting. Candidate polypeptides that transfer radioactivity to the histone substrate
compared to positive controls (extracts from hosts expressing known HAT
polypeptides) and negative controls (extract alone, histones without extract and
comparable vector-only) have HAT activity. Alternatively, plant HAT activity is
tested by determining whether expression of the corresponding cDNA is sufficient to
rescue a yeast HAT mutant.
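
The comparison of scintillation counts against controls can be sketched as follows. This example does not name a decision rule, so the cutoff used here (mean of the negative controls plus three standard deviations) and all counts-per-minute values are assumptions chosen only for illustration.

```python
from statistics import mean, stdev


def has_hat_activity(candidate_cpm, negative_control_cpm, n_sd=3.0):
    """Call a candidate active if its mean counts exceed the background cutoff.

    The cutoff (negative-control mean plus n_sd standard deviations) is an
    assumed rule, not one prescribed by the assay references.
    """
    background = mean(negative_control_cpm)
    cutoff = background + n_sd * stdev(negative_control_cpm)
    return mean(candidate_cpm) > cutoff


def relative_activity(candidate_cpm, negative_control_cpm, positive_control_cpm):
    """Express background-corrected counts as a fraction of the positive control."""
    background = mean(negative_control_cpm)
    span = mean(positive_control_cpm) - background
    if span <= 0:
        raise ValueError("positive control does not exceed background")
    return (mean(candidate_cpm) - background) / span


# Hypothetical replicate counts per minute from liquid scintillation counting.
negative_cpm = [100.0, 120.0, 110.0]     # e.g., vector-only extracts
positive_cpm = [1100.0, 1120.0, 1110.0]  # extracts expressing a known HAT
candidate_cpm = [600.0, 620.0, 610.0]
weak_cpm = [115.0, 125.0, 120.0]

is_active = has_hat_activity(candidate_cpm, negative_cpm)
fraction = relative_activity(candidate_cpm, negative_cpm, positive_cpm)
```

The same comparison could be applied to each class of negative control separately (extract alone, histones without extract, vector-only) if a stricter call is wanted.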

Example 2: Polypeptides having histone deacetylase activity.

Polypeptides are tested for histone deacetylase activity using assays
previously described by van der Vlag, J. and Otte A.P. in Nature Genetics,
25:474-478 (1999). Coding sequences of candidate polypeptides are cloned into an
appropriate expression vector, the expression vector is introduced into a bacterial
host strain, expression of the gene is induced and protein extract is prepared. The
extracts are incubated with [3H]-acetylated histones or histone segments for 3-6
hours at 37 °C under shaking conditions in a buffer containing 20 mM Tris-HCl, pH
7.4, and 50 mM NaCl. The reaction is stopped by adding 7.7 mM HCl/1.2 M acetic
acid, and extracted with ethyl acetate. After centrifugation, the ethyl acetate fraction
is counted in a liquid scintillation counter. Candidate polypeptides that remove
radioactivity from the histone substrate compared to positive controls (extracts from
hosts expressing known HDAC polypeptides) and negative controls (extract alone,
histones without extract, vector-only, and parallel trichostatin A-containing
reactions) have HDAC activity. Alternatively, HDAC activity is tested by
determining whether expressing a candidate HDAC polypeptide (e.g., using a
nucleic acid construct containing the corresponding cDNA clone) in a yeast HDAC
mutant can rescue the mutant phenotype.

Example 3: Chimeric HAT Nucleic Acid construct pFIE-15G-ESA1.

The chimeric HAT gene construct was constructed using standard molecular
biology techniques. The construct contains the coding sequence for the Arabidopsis
FIE polypeptide and the coding sequence for a truncated Arabidopsis HAT
polypeptide linked in frame by a DNA fragment encoding fifteen glycine residues.
The FIE coding sequence was obtained from plasmid pFIE3.6. The Arabidopsis FIE
polypeptide is a homolog of the Drosophila polycomb protein extra sex combs (esc)
(see Ohad et al., Plant Cell, 11:407-415 (1999)). The Arabidopsis HAT polypeptide
AtESA1 is a homolog of the yeast ESA1 polypeptide. pFIE-15G-ESA1 contains 5
binding sites for the DNA binding domain of the Gal4 transcription factor (UASGAL4)
located 5' to a CaMV35S minimal promoter. The CaMV35S minimal promoter is
located 5' to the FIE coding sequence. A DNA fragment encoding fifteen glycine
residues is present, in frame, at the 3' end of the UAS-FIE DNA sequence, followed,
in frame, by a DNA fragment encoding an Arabidopsis homologue of ESA1.

The coding sequence for a truncated AtESA1 was fused to the 3'-end of the
FIE coding sequence by fusion PCR (Levin HL, Mol. Cell Biol., 15:3310-3317
(1995)). Two intermediate PCR products were generated for this purpose. The first
intermediate product contained a coding sequence for FIE having a 15-glycine spacer
fused to its carboxy-terminus. This product was generated using two synthetic
oligonucleotides and a pFIE3.6 DNA template. Similarly, the second intermediate
PCR product contained a coding sequence for AtESA1 having a 15-glycine spacer
fused to its amino-terminus. This product was generated using two synthetic
oligonucleotides and a pAtESA1 cDNA template. The two intermediate products
were then fused to each other in a final round of PCR using a set of synthetic primers
that introduced a BglII site at the 5' end of the fusion and an XhoI site at the 3' end of
the fusion. The resultant PCR product contained a chimeric sequence encoding a
fusion peptide in which the carboxy-terminus of FIE is linked by a 15-glycine spacer
to the amino-terminus of the truncated AtESA1, consistent with the two intermediate
products described above. This final PCR product was digested with BglII and XhoI
and cloned into the Ti-plasmid vector pCRS304-5UAS which was previously digested
with BamHI and XhoI. The resulting plasmid was named pCRS304-5UAS-FIE-15G-ESA1.
The transgene was designated FIE-15G-ESA1. The amino acid sequence of the chimeric
polypeptide encoded by the transgene is shown in Table 9 and the nucleotide
sequence of the transgene is shown in Table 10.

Thus, pCRS304-5UAS-FIE-15G-ESA1 encodes a chimeric polypeptide
having an Arabidopsis thaliana FIE polypeptide and a truncated Arabidopsis
thaliana HAT polypeptide, linked by an intervening peptide spacer of 15 glycine
residues. The plasmid contains 5 copies of the Gal4 upstream activator sequence
(UASGAL4) located 5' and operably linked to the CaMV35S minimal promoter. This
in turn is located 5' and operably linked to the FIE-15G-ESA1 coding sequence. The
binding of a transcription factor that possesses a Gal4 DNA binding domain to the
Gal4 UAS is necessary for transcriptional activation.
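
The in-frame arrangement of an upstream coding sequence, a fifteen-glycine spacer and a downstream coding sequence can be checked in silico with a short sketch. The sequences below are short placeholders, not the FIE or AtESA1 sequences of Tables 9 and 10, and the choice of GGT as the glycine codon is an assumption, since the codons actually used for the spacer are not stated here.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(cds):
    """Translate complete codons of a DNA coding sequence ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def strip_stop(cds):
    """Remove a terminal stop codon so translation continues into the spacer."""
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def build_fusion(n_terminal_cds, c_terminal_cds, n_glycines=15, glycine_codon="GGT"):
    """Join two coding sequences in frame through a glycine spacer."""
    if CODON_TABLE[glycine_codon] != "G":
        raise ValueError("spacer codon must encode glycine")
    upstream = strip_stop(n_terminal_cds.upper())
    downstream = c_terminal_cds.upper()
    if len(upstream) % 3 or len(downstream) % 3:
        raise ValueError("coding sequences must be whole codons to stay in frame")
    return upstream + glycine_codon * n_glycines + downstream


# Placeholder coding sequences (hypothetical, for illustration only).
upstream_cds = "ATGGAGAAGTAA"    # encodes M-E-K followed by a stop codon
downstream_cds = "GCTCCGTGGTGA"  # encodes A-P-W followed by a stop codon

fusion = build_fusion(upstream_cds, downstream_cds)
protein = translate(fusion)
```

Removing the stop codon of the upstream sequence is what allows translation to read through the spacer into the downstream sequence; the length checks guard against a frameshift at either junction.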

Example 4: Chimeric HAT Nucleic Acid construct pMEA-15G-ESA1.

The chimeric HAT gene construct pMEA-15G-ESA1 was constructed using
standard molecular biology techniques. The construct contains the coding sequence for
the Arabidopsis MEA polypeptide and the coding sequence for an Arabidopsis HAT
polypeptide joined in frame by a DNA fragment encoding fifteen glycine residues. The
MEA coding sequence was obtained from plasmid pCB1(MEA-cDNA) (Kiyosue, T., et
al. (1999) Proc. Natl. Acad. Sci. USA 96:4186-4191). The Arabidopsis MEA polypeptide
is a homolog of the Drosophila polycomb protein Enhancer of zeste (E(z)) (see
Grossniklaus, U., et al. (1998) Science 280:446-450; Kiyosue, T., et al. (1999) Proc.
Natl. Acad. Sci. USA 96:4186-4191). The Arabidopsis HAT polypeptide AtESA1 is a
homolog of the yeast ESA1 polypeptide. The pMEA-15G-ESA1 plasmid contains 5
binding sites for the DNA binding domain of the Gal4 transcription factor (UASGAL4)
located 5' to a CaMV35S minimal promoter. The CaMV35S minimal promoter is located
5' to the MEA coding sequence. A DNA fragment encoding fifteen glycine residues is
present, in frame, at the 3' end of the UAS-MEA DNA sequence, followed, in frame, by
a DNA fragment encoding an Arabidopsis homologue of ESA1.

The AtESA1 coding sequence was fused to the 3'-end of the MEA coding
sequence by standard cloning techniques. Two intermediate PCR products were
generated for this purpose. The first intermediate product contained the MEA coding
sequence, flanked on either side by a BamHI restriction site. The BamHI sites were
generated by incorporation into the PCR primer sequences. The first intermediate
PCR product was digested with BamHI restriction enzyme and was cloned into the
T-DNA expression vector pCRS304-5USAL at its unique BamHI site. The resultant
plasmid was named pCRS304-5USAL-MEA-no 3'UTR.

The second intermediate PCR product contained a coding sequence for
AtESA1 having a 15-glycine spacer fused to its amino terminus. The second
intermediate PCR product was generated using two synthetic oligonucleotides and
the pAtESA1-cDNA template. The second PCR product was flanked by a unique
SmaI site at its 5' end and by a unique XhoI site at its 3' end. These cloning sites
were generated by incorporation into the PCR primer sequences. The second PCR
product was digested with SmaI and XhoI, and was cloned into the plasmid
pCRS304-5USAL-MEA-no 3'UTR between the unique restriction sites SmaI and
XhoI. The resultant plasmid was named pCRS304-5USAL-MEA-ESA1. The
transgene was designated MEA-15G-ESA1. The amino acid sequence of the
chimeric polypeptide encoded by the transgene is shown in Table 11 and the
nucleotide sequence of the transgene is shown in Table 12.

Thus, pCRS304-5UAS-MEA-15G-ESA1 encodes a chimeric polypeptide
having an Arabidopsis thaliana MEA polypeptide and an Arabidopsis thaliana HAT
polypeptide, joined by an intervening peptide spacer of 15 glycine residues. The
plasmid contains 5 copies of the Gal4 upstream activator sequence (UASGAL4)
located 5' and operably linked to the CaMV35S minimal promoter. This in turn is
located 5' and operably linked to the MEA-15G-ESA1 coding sequence. The
binding of a transcription factor that possesses a Gal4 DNA binding domain to the
Gal4 UAS is necessary for transcriptional activation.

Example 5: Transgenic plants.

The pCRS304-5UAS-FIE-15G-ESA1 plasmid and the pCRS304-5UAS-MEA-15G-ESA1
plasmid were independently introduced into Arabidopsis WS by Agrobacterium
tumefaciens mediated transformation using the floral infiltration technique essentially
as described in Bechtold, N. et al., C. R. Acad. Sci. Paris, 316:1194-1199 (1993).
Several transformed plants, designated FE #1, FE #2, and ME #1, were selected for
further study. The FIE-15G-ESA1 gene and the MEA-15G-ESA1 gene were then
transcriptionally activated in specific target cells and tissues by crossing with
two-component enhancer trap lines expressing a chimeric Gal4-VP16 activator protein
(Haseloff et al.). In each activator line there is also a UASGAL4-GFP (green
fluorescent protein) reporter gene.

Example 6: Two-component activation lines.

The two-component system for activating target gene expression was first
utilized in Drosophila and subsequently adopted for use in plants (see Bennett et al.
(1998) US Patent No. 5,801,027; Liu et al. (1999) US Patent No. 5,968,793; Bennett
et al. (2000) US Patent No. 6,127,606; Haseloff and Hodge (2001) US Patent No.
6,255,558). The two-component system typically consists of two independent
transcription units: an activator gene and a target gene. The activator gene encodes a
transcriptional activator, i.e., a DNA binding protein such as Gal4-VP16, and is operably
linked to a plant or animal promoter. The target gene has a protein coding sequence,
such as a cDNA, operably linked to a promoter that has multiple copies of an
upstream activator sequence element (UASGAL4) to which the transcriptional
activator protein can bind. A target gene can be activated genetically by crossing a
target gene-containing plant with an activation gene-containing plant (i.e., from an
"activator line"). Alternatively, a target gene in a cell, tissue, or whole organism can
be activated by transforming with an activation gene containing vector.
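
The logic of the two-component system can be summarized with a toy model: the target gene is transcribed only in a plant that carries it, and only in those cells or tissues where the activator is expressed. The sketch below is an idealization; the class, the function names and the tissue names are hypothetical, and the cross ignores segregation by assuming the progeny inherit every transgene from both parents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    """Minimal description of a line in a two-component system."""
    has_target_gene: bool = False           # UAS-linked target gene present
    activator_sites: frozenset = frozenset()  # tissues expressing the activator


def cross(parent_a, parent_b):
    """Idealized progeny that inherit every transgene from both parents."""
    return Plant(
        has_target_gene=parent_a.has_target_gene or parent_b.has_target_gene,
        activator_sites=parent_a.activator_sites | parent_b.activator_sites,
    )


def target_expression_sites(plant):
    """Tissues in which the target gene is expected to be transcribed."""
    return plant.activator_sites if plant.has_target_gene else frozenset()


# Hypothetical lines: one carries only the target gene, the other only an
# activator expressed in two tissues.
target_line = Plant(has_target_gene=True)
activator_line = Plant(activator_sites=frozenset({"root cap", "developing embryo"}))

progeny = cross(target_line, activator_line)
sites = target_expression_sites(progeny)
```

Neither parent alone is expected to transcribe the target gene in this model; only the progeny carrying both transcription units do, and only in the tissues defined by the activator line.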

An extensive collection of Arabidopsis two-component activation lines has
been produced and described by Dr. Jim Haseloff et al. (see
http://www.plantsci.cam.ac.uk/Haseloff/IndexCatalogue.html), and individual lines
are available from the Arabidopsis Biological Resource Center (see
http://www.arabidopsis.org/abrc/haseloff.htm). The activator lines were produced
using a T-DNA based enhancer trap strategy. In this system the Gal4-VP16 gene
containing a CaMV35S minimal promoter can be transcriptionally activated when
T-DNA is inserted proximal to an endogenous enhancer element. Enhancer activity is
revealed by the trans-activation of a UASGAL4-GFP reporter gene. Each activation
line in the Haseloff collection contains one or more random T-DNA insertions in the
Arabidopsis genome resulting in cell, tissue, or organ specific expression of a
UASGAL4-GFP reporter gene. The amino acid sequence of the GAL4uas-VP16
activator protein is shown in Table 8.

Six publicly available Arabidopsis two-component activation lines are
described in Table 3 including J2592, J0661, Q2500, M0164, J2301 and J2921.

Table 3. Activation line GFP expression pattern

Columns: Haseloff activation line reference number and ABRC seed stock number;
Ovule and seed development; Root; Other; GFP Intensity.

J2592 (CS9180)
  Ovule and seed development: Prefertilization ovule: ovule, funiculus and
    placenta. Developing seed: developing embryo and mature embryo.
  Root: Root cap, root epidermal cells.
  Other: Seedling: shoot and root epidermis, root cortex and root cap; hypocotyl,
    petiole epidermis, expanded cotyledon and leaf vasculature; stem epidermis
    and rosette leaf vasculature. Flower: sepal, petal and ovary vasculature;
    epidermis of mature sepal, petal, filament and ovary; stigma.
  GFP Intensity: Medium

M0164 (CS9307)
  Ovule and seed development: Mature embryo.
  Root: Root: weak patchy expression in vasculature of primary root.
  Other: Seedling: strong in shoot apex, rosette leaf and petiole vasculature.
    Weak in cotyledon vasculature. Silique: older siliques only.
  GFP Intensity: High

Q2500 (CS9135)
  Ovule and seed development: Ovule: prefertilization ovule. Seed: chalazal end
    of developing seed, seed coat and young embryo.
  Root: Root: vasculature.
  Other: Seedling: vasculature of hypocotyl, expanded cotyledons and first leaves.
    Flower: petal vasculature, placenta.
  GFP Intensity: High

J0661 (CS9141)
  Ovule and seed development: Developing seed: funiculus, embryo.
  Root: Root: root vasculature.
  Other: Seedling: vasculature including root, hypocotyl, expanded cotyledons,
    rosette leaf vasculature, petiole. Cauline leaf vasculature. Flower: floral
    organ vasculature including pedicel, sepal, petal, filament and pistil.
  GFP Intensity: Medium

J2921 (CS9194)
  Ovule and seed development: (no entry)
  Root: Root: weak patchy expression in root; weak in root hair; strong in root
    vasculature and root tip; strong in junctions where lateral roots form.
  Other: Flower: broad expression in epidermis of immature buds; GFP decreases
    and becomes restricted to the ovary as the flower matures; weak expression
    in sepal and petal vasculature.
  GFP Intensity: Medium

J2301 (CS9173)
  Ovule and seed development: Seed: seed coat; GFP increases as silique matures;
    GFP detectable at suspensor end of embryo.
  Root: Root: very strong in root tip; weak in root cortex; root epidermis.
  Other: Seedling: weak throughout seedling vasculature; strong in leaf
    trichomes; also detected in atrichoblasts. Flower: base of sepal and petal,
    ovary epidermis, style.
  GFP Intensity: Medium


Each activation line displays a characteristic pattern of GFP accumulation in
seedlings, vegetative organs and reproductive organs. GFP images are publicly
available at http://www.plantsci.cam.ac.uk/Haseloff/GAL4 and were independently
confirmed. For example, in line J2592 GFP expression was detectable in young
seedlings in the shoot and root epidermis, root cortex and root cap but not in the root
apical meristem. GFP was also observed in seedling hypocotyl, petiole epidermis,
expanded cotyledon and leaf vasculature. Low intensity GFP was detectable in the
stem epidermis as well as in rosette leaf vasculature. GFP was observed in J2592
flowers including the vasculature of the sepal, petal and ovary and in the epidermis
of the mature sepal, petal, filament, ovary and in stigmatic papillae. A low level of
GFP was detected in the pedicel. GFP was observed in pre-fertilization ovules and
in the funiculus and placenta. In fertilized seed GFP was detectable in developing
seeds and in mature embryos. GFP expression patterns were observed to vary in
some progeny of J2592.

In line M0164 seedlings, GFP expression was observed in the vasculature of the
primary root. No expression was detectable in the root cap. Relatively intense GFP
expression was observed in the shoot apex and in leaf and petiole vasculature. Low
intensity GFP expression was observed in the cotyledon vasculature. In developing seed
GFP was detectable in embryos. GFP expression was not detectable in the seed coat or
endosperm.

Example 7: FIE-15G-ESA1 activated plants.

The FIE-15G-ESA1 transgene was transcriptionally activated by crossing FE
#1 and FE #2 plants with the GAL4-VP16 two-component activation lines described
in Table 3. Reciprocal crosses were carried out using FE #1 and FE #2 plants with
each two-component activation line. The seed produced in such a cross are referred to
as F1 seed. Thus, a first generation seed or plant produced by crossing FE #1 as the
mother with J2592 as the pollen donor is referred to as F1 (FE #1 x J2592). A second
generation seed or plant produced by self pollination of F1 (FE #1 x J2592) is
referred to as F2 (FE #1 x J2592). F1 seed produced by crossing FE #1 and FE #2
with the activation lines described above were collected from mature siliques or seed
pods and dried using standard Arabidopsis procedures. These siliques typically
contained mature seed, abnormal seed and aborted ovules.

To analyze the effect of FIE-15G-ESA1 expression on Arabidopsis
development, F1 seed and seed from control plants were germinated on agar plates
containing 1x Murashige and Skoog (MS) salts and 1 percent sucrose using standard
Arabidopsis procedures. Germinated seedlings were scored 8 days after plating for
germination efficiency, the presence or absence of the activator gene (inferred from
GFP reporter gene activity) and seedling phenotypes. After phenotyping, F1
seedlings were transferred to soil at the four rosette leaf stage and then grown under
standard Arabidopsis greenhouse conditions. Flowering plants were tested by PCR
for the presence of the FIE-15G-ESA1 target gene and scored again for GFP
expression.

When line J2592 was used as the activation line, 86 percent of the F1 seeds
germinated normally. F1 seedlings and plants exhibited both vegetative and
reproductive effects of FIE-15G-ESA1 activity. For example, cotyledons were
observed to be incomplete, cupped, or missing in 30 percent of all seedlings
analyzed. In some instances, extra cotyledons were observed. Hypocotyl
development was perturbed in twelve percent of all F1 seedlings analyzed. Finally,
twenty-four percent of F1 seedlings displayed stunted or missing petioles.
Developmental abnormalities resulted in the loss of some seedlings from the study.
These phenotypes were not observed in seedlings produced by selfing J2592, FE #1,
or FE #2. Nor were these phenotypes observed in seedlings produced by crossing
these parents with a wild type plant. The results indicate that activation of
FIE-15G-ESA1 by J2592 is responsible for these diverse traits.

When activation line M0164 was used to activate FIE-15G-ESA1, ninety-seven
percent of the F1 (M0164 x FE #1) seed germinated successfully. Forty
percent of F1 seedlings analyzed showed vegetative defects including cotyledons that
were incomplete, cupped, or missing. In some instances, extra cotyledons were
observed. Thus, the F1 seedling phenotypes induced using FIE-15G-ESA1 were not
restricted to the J2592 activation line.

Reproductive phenotypes for F1 plants containing activator and FIE-15G-ESA1
target genes were analyzed as described in Ohad, N., et al. (1999) The Plant
Cell 11:407-415; and in Fischer, R.L., et al., (2001) US Patent 6,229,064. In brief,
developing siliques were sampled along the primary inflorescence proximal to distal
relative to the rosette leaves. Within each silique, the seed were classified according
to the color and the status of endosperm and embryo development. Since F1 seed are
the product of genetic crossing, each silique that is produced by an F1 plant should
contain a population of F2 seed that segregate for the activator and target genes and
any resulting phenotype. Thus, each silique contains a population of wild type seed
that provide a developmental reference for staging seed development and
phenotyping. Seed phenotypes were recorded at two stages of seed development: (i)
when the majority of seed in a silique were at the mature seed stage of embryo
development, and (ii) at the torpedo to walking stick stage of embryo development.

Effect of FIE-15G-ESA1 gene activity on seed development: F2 seeds were
produced by F1 plants through self-pollination. F2 (FE #1 x J2592) and F2 (FE #2 x
J2592) seed development was characterized using a Zeiss dissecting microscope and
a Zeiss Axioskope microscope as described by Ohad, N., et al., (1999) The Plant Cell
11:407-415 using standard Arabidopsis procedures.

Activation of FIE-15G-ESA1 by J2592 altered embryo and seed development
as shown in Table 4. Self-pollinated F1 (FE #1 x J2592) plants produced two classes
of seed: (i) those exhibiting normal embryo and seed development, and (ii) those
exhibiting abnormal seed and embryo development. Abnormal seed were found to
contain an embryo whose development was arrested at the transition between heart
and torpedo stages of development. By contrast, endosperm production was not
arrested in abnormal seed but was greater than or equal to that observed in normal
seed. Thus, FIE-15G-ESA1 was observed to alter the balance between endosperm
and embryo development within the seed. Most abnormal seed abort and degenerate
into shrunken seed. The percent abnormal to normal seed ranged from 25 to 62 percent
(see Table 4). Similar results also were observed in F1 (FE #2 x J2592) plants.
Similar results were observed when the reciprocal cross (i.e., J2592 x FE #1) was
performed. FIE-15G-ESA1 also was observed to alter seed development when J0661
was crossed with FE #1. By contrast, no abnormal seed were detected in F1 plants
produced by crossing Q2500, J2301 or J2921 with FE #1. In fact, more than 98
percent of seed from self-pollinated FE #1, FE #2 and J2592 parental lines had no
visually observable abnormalities. Thus, the effect of FIE-15G-ESA1 activity on
seed development appears to be promoter dependent.



TABLE 4.

Plant                            Row       Normal seeds   Shrunken aborted seeds
F1 (FE #1 x J2592) Plant #29     Total     225            683
                                 Percent   26.6           74.4
F1 (FE #1 x J2592) Plant #31     Total     264            760
                                 Percent   25.8           74.2
F1 (FE #1 x J2592) Plant #35     Total     264            760
                                 Percent   25.8           74.2

Plant                            Normal seeds   Shrunken aborted seeds   Percent   STD*
F1 (FE #1 x J2592) Plant #35     361            136                      37.7      3.8
F1 (FE #1 x J2592) Plant #26     379            151                      39.8      4.1
F1 (FE #1 x J2592) Plant #29     355            115                      32.40     3.40
F1 (FE #1 x J2592) Plant #37     369            125                      33.9      2.7
F1 (FE #1 x J2592) Plant #31     364            111                      30.5      1
F1 (FE #1 x J2592) Plant #32     308            192                      62.3      5.2

* STD = standard deviation
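
The single Percent value reported for Plants #35, #26, #29, #37, #31 and #32 in Table 4 is consistent with the number of shrunken aborted seeds expressed as a percentage of the number of normal seeds, i.e., the "percent abnormal to normal seed" referred to in the text. The following minimal sketch recomputes that value from the tabulated counts. The function name and the dictionary layout are illustrative assumptions and are not part of the original disclosure.

```python
# Seed counts copied from Table 4: (normal seeds, shrunken aborted seeds).
TABLE_4_COUNTS = {
    "Plant #35": (361, 136),
    "Plant #26": (379, 151),
    "Plant #29": (355, 115),
    "Plant #37": (369, 125),
    "Plant #31": (364, 111),
    "Plant #32": (308, 192),
}


def percent_abnormal_to_normal(normal, aborted):
    """Return aborted seed as a percentage of normal seed (hypothetical helper)."""
    return 100.0 * aborted / normal


if __name__ == "__main__":
    for plant, (normal, aborted) in TABLE_4_COUNTS.items():
        value = percent_abnormal_to_normal(normal, aborted)
        print(f"{plant}: {value:.1f} percent")
```

The recomputed values span the 25 to 62 percent range stated in the text only at the upper end; the lower end of that range corresponds to the rows of Table 4 that report a separate percentage for each seed class.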



Example 8: MEA-15G-ESA1 activated plants.

The MEA-15G-ESA1 transgene was transcriptionally activated by crossing
ME #1 with J2592, J0661 and Q2500 (see Table 5). Reciprocal crosses between ME
#1 and each activation line also were made. F1 seeds were collected at maturity and
stored under standard conditions. To analyze the effect of MEA-15G-ESA1
expression on Arabidopsis development, F1 seed and seed from control plants were
germinated on agar plates containing 1x MS salts and 1 percent sucrose.
Subsequently, plants were phenotyped as described in Example 7. Mature plants
were tested for the presence of MEA-15G-ESA1 by PCR.

When J2592 or Q2500 were crossed with ME #1 (pCRS304-5UAS-MEA-15G-ESA1
transformed plant #1) the F1 seedlings exhibited vegetative phenotypes
similar to those caused by FIE-15G-ESA1 in F1 (FE #1 x J2592) and (FE #1 x
M0164). For example, the cotyledons of F1 seedlings were observed to be
incomplete, cupped, or missing. Hypocotyl development also was perturbed. These
phenotypes were not observed in seedlings produced by the self pollination of J2592,
Q2500 or ME #1. Thus, activation of MEA-15G-ESA1 by J2592 and Q2500 is
responsible for these vegetative developmental effects.



Table 5.

The column headings of this table are illegible in the source; the four value
columns are therefore numbered (1) to (4) in the order in which they appear.

Plant                                   Row    (1)     (2)     (3)     (4)
F1 (ME #1 x J2592) Plant #1             AVG    17.4    6       14.4    37.8
                                        %      46.0    15.9    38.1    100
F1 (ME #1 x J2592) Plant #2             AVG    17.2    6.3     15.9    39.4
                                        %      43.7    16.0    40.4    100
F1 (ME #1 x J2592) Plant #3             AVG    23      0       15      38
                                        %      60.5    0.0     39.5    100
F1 (ME #1 x J2592) Plant #22            AVG    18.5    5.9     14.8    39.2
                                        %      47.2    15.1    37.8    100
F1 (ME #1 x J2592) Plant #24            AVG    24.6    0       15.5    40.1
                                        %      61.3    0.0     38.7    100
Control (GFP negative) Plant #26        AVG    43.2    0       0.2     43.4
                                        %      99.5    0.0     0.5     100

F1 (J2592 x ME #1) Plant #14            AVG    24.2    0.1     14.8    39.1
                                        %      61.9    0.3     37.9    100
F1 (J2592 x ME #1) Plant #26            AVG    15      6.2     16      37.2
                                        %      40.3    16.7    43.0    100
F1 (J2592 x ME #1) Plant #34            AVG    15.9    6.7     16.2    37.8
                                        %      42.1    17.7    40.2    100

F1 (ME #1 x [illegible]) Plant #32      AVG    15.9    0       12.4    28.3
                                        %              0.0     43.8
F1 (J0661 x ME #1) Plant #19            AVG    17.8    0       17.5    35.3
                                        %      50.4    0.0     49.6
F1 (J0661 x ME #1) Plant #27            AVG    18.6    0       15.4    34
                                        %      54.7    0.0     45.3
F1 (J0661 x ME #1) Plant #28            AVG    18.2    0.1     16.8    35.1
                                        %      51.9    0.3     47.9


Example 9: Fertilization independent seed development 

To determine the frequency of post-fertilization seed abortion, siliques
harvested at two weeks and at four weeks after self-pollination were dissected, and
wild-type and aborted seeds were counted. To test for fertilization-independent
development, flower buds from plants that had not yet begun to shed pollen (i.e.,
stage 12 plants) (see Smyth, D.R., et al., Plant Cell, 2: 755-761 (1990)) were opened,
immature anthers were removed, and the flower bud was covered with a plastic bag.
In some experiments, the silique was measured and dissected, and the number of
seed-like structures and degenerated ovules were counted after seven days. In some
experiments, the silique was harvested and ovules and seed-like structures were
phenotyped after 15 days.

When immature F1 (J2592 x FE #1) flowers were emasculated and allowed
to develop, seed-like structures were observed that were filled with endosperm but
contained no embryo. This occurred in roughly 40 percent of the siliques analyzed.
Thus, activation of FIE-15G-ESA1 by J2592 also can induce fertilization
independent endosperm and seed development.

Example 10: Profiling gene expression.

This example demonstrates the use of chimeric polypeptides for RNA
expression profiling. Gene expression in developing flowers from F1 (J2592 x FE
#1) was compared to gene expression in flowers from activation line J2592 (see
Table 3) and target line FE #1 using microarray expression analysis. All
experiments were done in duplicate.

Sample preparation: Seeds of F1 (J2592 x FE #1) plants were sterilized in
95% bleach for 1 minute and with 70% ethanol for 45 seconds and subsequently
washed 5 times in sterile distilled deionized water and then plated on MS agar plates
and left at 4 °C for 4 days to be vernalized. Plates were placed in a growth chamber
with 16 hr light/8 hr dark, 23 °C, 14,500-15,900 LUX, and 70% relative humidity
for germination and growth. Seedlings were PCR-genotyped for the presence of the
transgene and analyzed using dissecting microscopy for GFP expression before they
were transplanted individually into soil. Tissues harvested for RNA extraction
consisted of compact terminal inflorescences. Each sample contained a population
of sequentially produced and continuously developing flowers representing all stages
of flower development from early floral primordia, to immature floral buds, to
mature flowers up to and including two days after pollination. Samples were flash
frozen in liquid nitrogen and stored at -80 °C until use. Total RNA was extracted
using the Qiagen RNeasy Kit with the protocol recommended by the manufacturer and the
RNA was then dissolved in RNase-free water.

Approximately 10 µg of each RNA sample was used for amplification
using the MessageAmp™ aRNA Kit provided by Ambion, Inc. Poly(A+) mRNA was
isolated using standard procedures (Poly(A) Quik mRNA Isolation Kit; Stratagene,
La Jolla, California), and 2 µg from each sample was used to generate labeled probes
for hybridization to microarray slides containing Arabidopsis cDNA sequences. The
Arabidopsis microarray contained nucleic acid features representing 10,000 different
Arabidopsis genes. Hybridization experiments to detect differentially regulated
genes were set up in pairs. For example, RNA from the F1 (J2592 x FE #1) plant
was compared to RNA from either the Arabidopsis activation line J2592 or the
Arabidopsis transgenic line FE #1. Expression results were analyzed using standard
software and procedures.

Slide preparation: Microarray technology provides the ability to monitor mRNA
transcript levels of thousands of genes in a single experiment. These experiments
simultaneously hybridize two differentially labeled fluorescent cDNA pools to glass
slides that have been previously spotted with cDNA clones of the same species. Each
arrayed cDNA spot will have a corresponding ratio of fluorescence that represents the
level of disparity between the amount of respective mRNA species in the two sample
pools. Thousands of polynucleotides can be spotted on one slide, and each experiment
analyzes the expression pattern of thousands of mRNA species.
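
As an illustration of the per-spot ratio of fluorescence described above, the following minimal sketch computes the ratio of two channel intensities and its base-2 logarithm. The function names and the example intensities are hypothetical; the sketch does not reproduce any particular software package.

```python
import math


def expression_ratio(sample_intensity, reference_intensity):
    """Ratio of the two fluorescence intensities measured at one spot."""
    return sample_intensity / reference_intensity


def log2_ratio(sample_intensity, reference_intensity):
    """Base-2 logarithm of the ratio; -1.0 corresponds to a 2-fold reduction."""
    return math.log2(expression_ratio(sample_intensity, reference_intensity))


if __name__ == "__main__":
    # Hypothetical intensities for a single spot.
    sample, reference = 500.0, 1000.0
    print(expression_ratio(sample, reference))
    print(log2_ratio(sample, reference))
```

A ratio of 0.5 or less for a spot corresponds to an mRNA species that is reduced at least 2-fold in the first sample pool relative to the second.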

The microarray utilizes a chemically coated microscope slide, referred to herein as a
"chip", with numerous polynucleotide samples arrayed at a high density. The coating with
chemicals such as Poly-L-lysine allows for spotting DNA at high density by providing a
hydrophobic surface, reducing the spreading of spots of DNA solution arrayed on the
slides. Glass microscope slides (Gold Seal #3010 manufactured by Gold Seal Products,
Portsmouth, New Hampshire, USA) were coated with a 0.1% W/V solution of
Poly-L-lysine (Sigma, St. Louis, Missouri) using the following protocol:

Slides were placed in slide racks (Shandon Lipshaw #121). The racks were then
put in chambers (Shandon Lipshaw #121). Cleaning solution was prepared by dissolving
70 g NaOH in 280 mL ddH2O. 420 mL 95% ethanol was added. The total volume was
700 mL (= 2 x 350 mL) and the solution was stirred until completely mixed. If the
solution remained cloudy, ddH2O was added until the solution cleared. The cleaning
solution was poured into chambers with slide racks, and the chambers were covered with
glass lids. The solution was mixed on an orbital shaker for 2 hr. The racks were quickly
transferred to fresh chambers filled with ddH2O and were rinsed vigorously by plunging
racks up and down. Rinses were repeated 4 times with fresh ddH2O each time, to remove
NaOH-ethanol. Poly-L-lysine solution was prepared by adding 70 mL poly-L-lysine
stock solution to 70 mL tissue culture PBS in 560 mL double-distilled deionized water
using plastic graduated cylinders and beakers. Slides were transferred to polylysine
solution and shaken on an orbital shaker for 1 hr. The rack was transferred to a fresh
chamber filled with ddH2O, and was plunged up and down 5 times to rinse. The slides
were centrifuged on microtiter plate carriers (paper towels were placed below the rack to
absorb liquid) for 5 min. at 500 rpm. The slide racks were transferred to empty chambers
with covers, and were dried in a 45 °C oven for 10 min. The slides were stored in a
closed plastic slide box in the dark. Normally, the surface of lysine coated slides was not
very hydrophobic immediately after this process, but became increasingly hydrophobic
with storage. A hydrophobic surface helped ensure that spots did not run together while
printing at high densities. After they aged for 10 days to a month the slides were ready
for use. Stored slides that developed opaque patches, visible when held to the light, can
result in high background hybridization from the fluorescent probe and were not used.

PCR amplification of cDNA clones: Polynucleotides were amplified from
Arabidopsis cDNA clones using one insert specific primer and one common primer that
hybridized to the cloning site. The resulting 100 µl PCR reactions were purified with
Qiaquick 96 PCR purification columns (Qiagen, Valencia, California, USA) and eluted in
30 µL of 5 mM Tris. 8.5 µL of the elution were mixed with 1.5 µL of 20X SSC to give a
final spotting solution of DNA in 3X SSC. The concentrations of DNA generated from
each clone varied between 10-100 ng/µl, but were usually about 50 ng/µl.
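
The 3X SSC figure follows from a simple dilution calculation (stock concentration multiplied by stock volume, divided by total volume). The sketch below illustrates the arithmetic; the function name is a hypothetical helper introduced only for this illustration.

```python
def final_concentration(stock_conc, stock_volume, other_volume):
    """Concentration after mixing stock_volume of a stock at stock_conc
    with other_volume of a solution that contains none of the component."""
    total_volume = stock_volume + other_volume
    return stock_conc * stock_volume / total_volume


if __name__ == "__main__":
    # 1.5 µL of 20X SSC mixed with 8.5 µL of eluted PCR product.
    ssc = final_concentration(stock_conc=20.0, stock_volume=1.5, other_volume=8.5)
    print(f"Final SSC concentration: {ssc:g}X")
```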

Arraying PCR products on slides: Purified PCR products were spotted onto
poly-L-lysine coated glass slides using an arrangement of quill-tip pins (ChipMaker 3 spotting
pins; Telechem International, Inc., Sunnyvale, California, USA) and a robotic arrayer
(PixSys 3500, Cartesian Technologies, Irvine, California, USA). Approximately 0.5 nl of
a prepared PCR product was spotted at each location to produce spots having a diameter
of about 100 µm. Spots were spaced 180 µm to 210 µm center-to-center. Printing was
conducted in a chamber with relative humidity set at 50%. Slides containing maize
sequences were purchased from Agilent Technology (Palo Alto, CA 94304).

Slide processing: After arraying, slides were processed through a series of steps
prior to hybridization: rehydration, UV cross-linking, blocking and denaturation. Slides
were rehydrated by placing them over a beaker of warm (55 °C) water (DNA face down),
for 2-3 sec to distribute the DNA evenly within the spots, and then snap dried on a hot
plate (DNA side face up). The DNA was cross-linked to the slides by UV irradiation
(60-65 mJ; 2400 Stratalinker, Stratagene, La Jolla, California, USA). A blocking step was
performed to modify remaining free lysine groups, and hence minimize their ability to
bind labeled probe DNA. To achieve this, the arrays were placed in a slide rack. An
empty slide chamber was left ready on an orbital shaker. The rack was bent slightly
inwards in the middle, to ensure the slides would not run into each other while shaking.
The blocking solution was prepared as follows:

Three 350-ml glass chambers (with metal tops) were set to one side, and a large
round Pyrex dish with dH2O was placed ready in the microwave. At this time, 15 ml
sodium borate was prepared in a 50 ml conical tube. 6 g succinic anhydride was
dissolved in about 325-350 mL 1-methyl-2-pyrrolidinone. Rapid addition of reagent was
important. Immediately after the last flake of the succinic anhydride dissolved, 15 mL
sodium borate was added. Immediately after the sodium borate solution mixed in, the
solution was poured into an empty slide chamber. The slide rack was plunged rapidly and
evenly in the solution and was vigorously shaken up and down for a few seconds, making
sure slides never left the solution. It was mixed on an orbital shaker for 15-20 min.
Meanwhile, the water in the Pyrex dish (enough to cover slide rack) was heated to
boiling. Following this, the slide rack was gently plunged into 95 °C water for 2 min.
The slide rack then was plunged 5 times in 95% ethanol. The slides and rack were
centrifuged for 5 min. at 500 rpm. Slides were loaded quickly and evenly onto the
carriers to avoid streaking, and were used immediately or were stored in a slide box.

Hybridization: The hybridization process began with the isolation of mRNA from
the two tissues followed by their conversion to single stranded cDNA (see "Generation of
probes for hybridization", below). The cDNA from each tissue was independently
labeled with a different fluorescent dye and then both samples were pooled together. This
final differentially labeled cDNA pool was then placed on a processed microarray and
allowed to hybridize (see "Hybridization and wash conditions", below).

Preparation of Yeast control mRNA: Plasmid DNA was isolated from the
following yeast clones using Qiagen filtered maxiprep kits (Qiagen, Valencia, California):
YAL022c(Fun26), YAL031c(Fun21), YBR032w, YDL131w, YDL182w, YDL194w,
YDL196w, YDR050c and YDR116c. Plasmid DNA was linearized with either BsrBI
(YAL022c(Fun26), YAL031c(Fun21), YDL131w, YDL182w, YDL194w, YDL196w,
YDR050c) or AflIII (YBR032w, YDR116c).

The following solution was incubated at 37 °C for 2 hours: 17 µl isolated yeast
insert DNA (1 µg), 20 µl 5X buffer, 10 µl 100 mM DTT, 2.5 µl (100 U) RNasin, 20 µl
2.5 mM (ea.) rNTPs, 2.7 µl (40 U) SP6 polymerase and 27.8 µl RNase-free deionized
water. Two µl (2 U) Ampli DNase I was added and the incubation continued for another
15 min. Ten µl 5 M NH4OAc and 100 µl phenol:chloroform:isoamyl alcohol (25:24:1)
were added, and the solution was vortexed and centrifuged to separate the phases. To
precipitate the RNA, 250 µl ethanol was added and the solution was incubated at -20 °C
for at least one hour. The sample was then centrifuged for 20 min. at 4 °C at
14,000-18,000 x g, the pellet was washed with 500 µl of 70% ethanol, air dried at room
temperature for 10 min. and resuspended in 100 µl of RNase-free deionized water. The
precipitation procedure was repeated one time.
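
The component volumes listed for the in vitro transcription reaction add up to 100 µl, from which the final concentration of each stock reagent can be derived. The following sketch performs that bookkeeping. The dictionary keys and function names are illustrative assumptions; the volumes and stock concentrations are those recited above.

```python
# Component volumes in microliters, as recited for the transcription reaction.
REACTION_VOLUMES_UL = {
    "yeast insert DNA": 17.0,
    "5X buffer": 20.0,
    "100 mM DTT": 10.0,
    "RNasin": 2.5,
    "2.5 mM (each) rNTPs": 20.0,
    "SP6 polymerase": 2.7,
    "RNase-free water": 27.8,
}


def total_volume(volumes):
    """Sum of all component volumes (microliters)."""
    return sum(volumes.values())


def diluted(stock_conc, component, volumes):
    """Final concentration of one component after mixing all components."""
    return stock_conc * volumes[component] / total_volume(volumes)


if __name__ == "__main__":
    print(f"Total volume: {total_volume(REACTION_VOLUMES_UL):.1f} µl")
    print(f"Buffer: {diluted(5.0, '5X buffer', REACTION_VOLUMES_UL):.2f}X")
    print(f"DTT: {diluted(100.0, '100 mM DTT', REACTION_VOLUMES_UL):.2f} mM")
    print(f"rNTPs: {diluted(2.5, '2.5 mM (each) rNTPs', REACTION_VOLUMES_UL):.2f} mM each")
```

Under these figures the buffer is at 1X in the assembled reaction, which is consistent with the use of a 5X stock at one fifth of the total volume.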

Alternatively, after the two-hour incubation, the solution was extracted with
phenol/chloroform once before adding 0.1 volume 3 M sodium acetate and 2.5 volumes of
100% ethanol. The solution was centrifuged at 15,000 rpm, 4 °C for 20 min. and the
pellet resuspended in RNase-free deionized water. The DNase I treatment was carried out
at 37 °C for 30 min. using 2 U of Ampli DNase I in the following reaction condition: 50
mM Tris-HCl (pH 7.5), 10 mM MgCl2. The DNase I reaction was then stopped with the
addition of NH4OAc and phenol:chloroform:isoamyl alcohol (25:24:1), and RNA
isolated as described above.
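
The "0.1 volume" and "2.5 volumes" proportions used for ethanol precipitation scale with the sample volume. The following sketch computes the two additions for a given sample volume. The function name and the 100 µl example volume are assumptions made for illustration only.

```python
def precipitation_volumes(sample_volume_ul,
                          salt_fraction=0.1,
                          ethanol_fraction=2.5):
    """Return (sodium acetate volume, ethanol volume) in microliters for a
    sample, using 0.1 volume of 3 M sodium acetate and 2.5 volumes of
    100% ethanol relative to the sample volume."""
    salt = salt_fraction * sample_volume_ul
    ethanol = ethanol_fraction * sample_volume_ul
    return salt, ethanol


if __name__ == "__main__":
    # Hypothetical 100 µl sample.
    salt, ethanol = precipitation_volumes(100.0)
    print(f"3 M sodium acetate: {salt:g} µl; 100% ethanol: {ethanol:g} µl")
```

The same proportions can be applied to other sample volumes; for example, a 55 µl sample would call for 5.5 µl of 3 M sodium acetate and 137.5 µl of ethanol.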

0.15-2.5 ng of the in vitro transcript RNA from each yeast clone was added to
each plant mRNA sample prior to labeling to serve as positive (internal) probe controls.

Generation of labeled probes for hybridization from first-strand cDNA:
Hybridization probes were generated from isolated mRNA using an Atlas™ Glass
Fluorescent Labeling Kit (Clontech Laboratories, Inc., Palo Alto, California, USA). This
entails a two step labeling procedure that first incorporates primary aliphatic amino
groups during cDNA synthesis and then couples fluorescent dye to the cDNA by reaction
with the amino functional groups. Briefly, 5 µg of oligo(dT)18 primer
d(TTTTTTTTTTTTTTTTTTV) was mixed with Poly A+ mRNA (1.5-2 µg mRNA
isolated using the Qiagen Oligotex mRNA Spin-Column protocol or the Stratagene
Poly(A) Quik mRNA Isolation protocol (Stratagene, La Jolla, California, USA)) in a total
volume of 25 µl. The sample was incubated in a thermocycler at 70 °C for 5 min., cooled
to 48 °C and 10 µl of 5X cDNA Synthesis Buffer (kit supplied), 5 µl 10X dNTP mix
(dATP, dCTP, dGTP, dTTP and aminoallyl-dUTP; kit supplied), 7.5 µl deionized water
and 2.5 µl MMLV Reverse Transcriptase (500 U) added. The reaction was then
incubated at 48 °C for 30 min., followed by a 1 hr incubation at 42 °C. At the end of the
incubation, the reaction was heated to 70 °C for 10 min., cooled to 37 °C and 0.5 µl (5 U)
RNase H added, before incubating for 15 min. at 37 °C. The solution was vortexed for 1
min. after the addition of 0.5 µl 0.5 M EDTA and 5 µl of QuickClean Resin (kit supplied)
then centrifuged at 14,000-18,000 x g for 1 min. After removing the supernatant to a
0.45 µm spin filter (kit supplied), the sample was again centrifuged at 14,000-18,000 x g
for 1 min., and 5.5 µl 3 M sodium acetate and 137.5 µl of 100% ethanol added to the
sample before incubating at -20 °C for at least 1 hr. The sample was then centrifuged at
14,000-18,000 x g at 4 °C for 20 min., the resulting pellet washed with 500 µl 70%
ethanol, air-dried at room temperature for 10 min. and resuspended in 10 µl of 2X
fluorescent labeling buffer (kit provided). 10 µl each of the fluorescent dyes Cy3 and
Cy5 (Amersham Pharmacia, Piscataway, New Jersey, USA; prepared according to
Atlas™ kit directions of Clontech) were added and the sample incubated in the dark at
room temperature for 30 min. to 1 hr.

The fluorescently labeled first strand cDNA was precipitated by adding 2 µl 3 M
sodium acetate and 50 µl 100% ethanol, incubated at -20 °C for at least 2 hrs, centrifuged
at 14,000-18,000 x g for 20 min., washed with 70% ethanol, air-dried for 10 min. and
dissolved in 100 µl of water. Alternatively, 3-4 µg mRNA, 2.5 µl (~8.9 ng of in vitro
translated mRNA) yeast control and 3 µg oligo dTV
(TTTTTTTTTTTTTTTTTT(A/C/G); Sequence ID No.: X) were mixed in a total volume
of 24.7 µl. The sample was incubated in a thermocycler at 70 °C for 10 min. before
chilling on ice. To this, 8 µl of 5X first strand buffer (Superscript II RNase H- Reverse
Transcriptase kit from Invitrogen, Carlsbad, California 92008; cat no. 18064022), 0.8 µl
of aa-dUTP/dNTP mix (50X; 25 mM dATP, 25 mM dGTP, 25 mM dCTP, 15 mM dTTP,
10 mM aminoallyl-dUTP), 4 µl of 0.1 M DTT and 2.5 µl (500 U) of Superscript R.T. II
enzyme (Stratagene) were added. The sample was incubated at 42 °C for 2 hours before a
10 µl mixture of 1 M NaOH and 0.5 M EDTA was added. After a 15 minute incubation
at 65 °C, 25 µl of 1 M Tris pH 7.4 was added. This was mixed with 450 µl of water in a
Microcon 30 column before centrifugation at 11,000 x g for 12 min. The column was
washed twice with 450 µl (centrifugation at 11,000 x g for 12 min.) before eluting the
sample by inverting the Microcon column and centrifuging at 11,000 x g for 20 seconds.
Sample was dehydrated by centrifugation under vacuum and stored at -20 °C.

Each reaction pellet was dissolved in 9 µl of 0.1 M carbonate buffer (0.1 M
sodium carbonate and sodium bicarbonate, pH = 8.5-9) and 4.5 µl of this was placed in two
microfuge tubes. 4.5 µl of each dye (in DMSO) was added, and the mixture was
incubated in the dark for 1 hour. 4.5 µl of 4 M hydroxylamine was added and the mixture
was again incubated in the dark for 15 min.

Irrespective of the method used for probe generation, the probe was purified using
a Qiagen PCR cleanup kit (Qiagen, Valencia, California, USA), and eluted with 100 µl
EB (kit provided). The sample was loaded on a Microcon YM-30 (Millipore, Bedford,
Massachusetts, USA) spin column and concentrated to 4-5 µl in volume. Probes for the
maize microarrays were generated using the Fluorescent Linear Amplification Kit (cat.
No. G2556A) from Agilent Technologies (Palo Alto, CA).

Hybridization conditions: Labeled probe was heated at 95 °C for 3 min. and was
then chilled on ice. Then, 25 µl of the hybridization buffer, which was warmed at 42 °C,
was added to the probe and was mixed by pipetting to give a final concentration of: 50%
formamide, 4x SSC, 0.03% SDS, 5x Denhardt's solution, and 0.1 µg/ml single-stranded
salmon sperm DNA. The probe was kept at 42 °C. Prior to hybridization, the probe was
heated for 1 min., added to the array, and then covered with a glass cover slip. Slides
were placed in hybridization chambers (Telechem International, Sunnyvale, California)
and incubated at 42 °C overnight.

Washing conditions: Slides first were washed in 1x SSC + 0.03% SDS solution
at room temperature for 5 min. Slides then were washed in 0.2x SSC at room temperature
for 5 min. Slides finally were washed in 0.05x SSC at room temperature for 5 min.
Slides then were spun at 800 x g for 2 min. to dry. They were then scanned.

Scanning of slides: Chips were scanned using a ScanArray 3000 or 5000 (General
Scanning, Watertown, Massachusetts, USA). The chips were scanned at 543 nm and at
633 nm at a resolution of 10 µm to measure the intensity of the two fluorescent dyes
incorporated into the samples hybridized to the chips.

Data extraction and analysis: The images generated by scanning slides consisted
of two 16-bit TIFF images representing the fluorescent emissions of the two samples at
each arrayed spot. These images were quantified and processed for expression analysis
using Imagene™ (Biodiscovery, Los Angeles, California, USA) data extraction software.
Imagene™ output was analyzed using the Genespring™ (Silicon Genetics, San Carlos, California,
USA) analysis software. In Genespring™, the data was imported using median pixel
intensity measurements derived from Imagene™ output. Ratio calculation and
normalization were conducted using Genespring™. Normalization was achieved by
parsing the data into 32 groups, each of which represented one of the 32 pin printing
regions on the microarray. Each group consisted of about 360 to 550 spots, and was
independently normalized by setting the median of ratios to one and multiplying ratios by
the appropriate factor.
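
The per-region normalization described above can be sketched as follows. This is only an illustrative Python sketch, not the Genespring™ implementation; the function name normalize_by_print_group and the (group, ratio) input layout are assumptions made for the example.

```python
from statistics import median


def normalize_by_print_group(spots):
    """Scale ratios so that the median ratio within each print group is 1.

    `spots` is a list of (group_id, ratio) tuples, where group_id identifies
    the pin printing region (hypothetical layout chosen for this sketch).
    """
    # Collect the ratios belonging to each pin printing region.
    ratios_by_group = {}
    for group_id, ratio in spots:
        ratios_by_group.setdefault(group_id, []).append(ratio)

    # The "appropriate factor" for a group is 1 / (median ratio of that group).
    factor = {g: 1.0 / median(r) for g, r in ratios_by_group.items()}

    # Multiply every ratio by the factor of its own group.
    return [(g, ratio * factor[g]) for g, ratio in spots]


# Toy data: two groups instead of 32, three spots each instead of 360 to 550.
example = [(0, 2.0), (0, 4.0), (0, 8.0), (1, 0.5), (1, 0.25), (1, 1.0)]
normalized = normalize_by_print_group(example)
```

After scaling, each group has a median ratio of one, so ratios from different pin printing regions can be compared on a common scale.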

Results: Among the ten thousand genes represented on the DNA chip, the
expression ratio of 152 genes (1.52%) was found to be reduced at least 2-fold in F1 (J2592
x FE #1) floral tissue when compared to floral tissue from FE #1. Similarly, the
expression ratios of 63 genes (0.63%) were found to be down at least 2-fold in F1 (J2592
x FE #1) floral tissue when compared to floral tissue from J2592. By contrast, the
expression ratio of 227 genes (2.27%) was increased more than 2-fold in F1 (J2592 x FE
#1) floral tissue when compared to floral tissue from FE #1. Similarly, 39 genes (0.39%)
were found to be up at least 2-fold in F1 (J2592 x FE #1) compared to J2592 floral tissue.
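
The percentages reported above follow directly from the gene counts and the chip size. The short Python sketch below reproduces that arithmetic and shows one way a 2-fold threshold could be applied to a normalized ratio; the names percent_of_chip and classify are hypothetical, and treating the threshold as inclusive ("at least 2-fold") is an assumption of the sketch.

```python
GENES_ON_CHIP = 10_000  # "ten thousand genes represented on the DNA chip"

# Gene counts as reported in the Results paragraph.
reported_counts = {
    "down vs FE #1": 152,
    "down vs J2592": 63,
    "up vs FE #1": 227,
    "up vs J2592": 39,
}


def percent_of_chip(count, total=GENES_ON_CHIP):
    """Express a gene count as a percentage of all genes on the chip."""
    return 100.0 * count / total


def classify(ratio, threshold=2.0):
    """Label a normalized expression ratio relative to a fold-change threshold."""
    if ratio >= threshold:
        return "up"
    if ratio <= 1.0 / threshold:
        return "down"
    return "unchanged"


percentages = {k: round(percent_of_chip(v), 2) for k, v in reported_counts.items()}
```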

Example 11: Analysis of FIE-15G-ESA1 Activated Plants

The FIE-15G-ESA1 transgene is transcriptionally activated by crossing
female FE plants containing a FIE-15G-ESA1 transgene to enhancer trap HAP1-VP16
lines that display cell and tissue specific GFP accumulation in vegetative and
reproductive organs. FE plants are crossed with four different activation lines. A
different enhancer is present in each of the lines and confers expression of the
HAP1-VP16 transcription activator, as well as the GFP, in a different set of tissues.
The amino acid sequence of the HAP1 portion of the HAP1-VP16 transcription
activator is that of the yeast HAP1 gene. The activity of each enhancer-trap line is
inferred from the GFP fluorescence.

At maturity, F1 seeds are collected and stored under standard conditions. A
reciprocal cross is also made, in which FE plants are used as males.

F1 seeds are germinated and the resulting plants are allowed to self-pollinate. After
pollination, some of the embryos and seeds developing on F1 plants are examined under a
microscope. Mature seed also are analyzed as described in Example 7. Seedlings are scored for
GFP expression and tested for the presence of FIE-15G-ESA1 by PCR. Phenotypic
traits are analyzed as described in Example 7.

Table 6: HAT Polypeptide Sequences

Arabidopsis ESA1-like

MGSSANTETNGNAPPPSSNQKPPATNGVDGSHPPPPPLTPDQAIIESDPSKKRKMGMLPL 
EVGTRVMCRWRDGKHHPVKVIERRRIHNGGQNDYEYYVHYTEFNRRLDEWTQLDQLDLDS 
VECAVDEKLEDKVTSLKMTRHQKRKIDETHIEGHEELDAASLREHEEFTKVKNISTIELG 
KYEIETWYFSPFPPEYNDCVKLFFCEFCLNFMKRKEQLQRHMXKCDLKHPPGDEIYRSGT 
LSMFEVDGKKNKVYAQNLCYLAKLFLDHKTLYYDVDLFLFYVLCECDDRGCHMVGYFSKE 
KHSEEAYNLACILTLPSYQRKGYGKFLIAFSYELSKKEGKVGTPXKTLVGSRLTKLQRLL 
DSCSIRNLEKT 

Maize HAC000003 

MDSHSSHLNAANRSRSSQTPSPSHSASASVTSSLHKRKLAATTAANAAASEDHAPPSSSF 
PPSSFSADTRDGALTSNDELESISARGADTDSDPDESEDIVVDDDEDEFAPEQDQDSSIR 
TFTAARLDSSSGVNGSSRNTKLKTESSTVKLESSDGGKDGGSSVVGTGVSGTVGGSSISG 
LVPKDESVKVLAENFQTSGAYIAREEALKREEQAGRLKFVCYSNDSIDEHMMCLIGLKNI 
FARQLPNMPKEYIVRLLMDRKHKSVMVLRGNLVVGGITYRPYHSQKFGEIAFCAITADEQ 
VKGYGTRLMNHLKQHARDVDGLTHFLTYADNNAVGYFVKQEIPQSFTSKSSVSTLSYQGF 
TKEIYLEKDVWHGFIKDYDGGLLMECKIDPKLPYTDLSSMIRQQRKAIDERIRELSNCQN 
VYPKIEFLKNEAGIPRKIIKVEEIRGLREAGWTPDQWGHTRFKLFNGSADMVTNQKQLNA 
LMRALLKTMQDHADAWPFKEPVDSRDVPDYYDIIKDPIDLKVIAKRVESEQYYVTLDMFV 
ADARRMFNNCRTYNSPDTIYYKCATRLETHFHSKVQAGLQSGAKSQ 

Arabidopsis HAT1

MSVHVKEEPVLVPNCDVENTELAVFNGNGESELENFGTCVDEITDRVNQLEQKVVEVEHF 
YSTKDGAAQTNTSKSNSGGKKIAISQPNNSKGNSAGKEKSKGKHVSSPDLMRQFATMFRQ 
lAQHKWAWPFLEPVDVKGLGLHDYYKVIEKPMDLGTIKKKMESSEYSNVREIYADVRLVF 
KNAMRYNEEKEDVYVMAESLLEKFEEKWLLIMPKLVEEEKKQVDEEAEKHANKQLTMEAA 
QAEMARDLSNELYEIDLQLEKLRESVVQRCRKLSTQEKKGIiSAALGRLSPEDLSKALKMV 
SESNPSFPAGAPEVELDIDVQTDVTLWRLKVFVQEALKAANKSSGGTNAQNNNNTGTGEI 
NKNNAKRRREISDAINKASIKRAKKA 






WO 03/000715



PCT/US02/19750 



Table 7: CAP/HDAC Gene and Polypeptide Sequences 



MEA, GI:3089625

MEKENHEDDG EGLPPELNQI KEQIEKERFL HIKRKFELRY IPSVATHASH HQSFDLNQPA AEDDNGGDNK SLLSBMQNPL 
6 RHFSASSDYN SYEDQGYVLD EDQDYALEED VPLFLDEDVP LLPSVKLPIV EKLPRSITWV FTKSSQLMAE SDSVIGKRQI 
YYIiNGEALEL SSEEDEEDEE EDEEEIKKBK CEFSEDVDRF IWTVGQDYGL DDLWRRAIA KYLEVDVSDI LERYNEIiKLK 
NDGTAGEASD LTSKTITTAF QDFADRRHCR RCMIFDCHMH EKYEPESRSS EDKSSLFEDE DRQPCSEHCY LKVRSVTEAD 
HVMDNDNSIS NKIWSDPNN TMWTPVEKDL YLKGIEIFGR NSCDVALNIL RGLKTCLEIY NYMREQDQCT MSLDLNKTTQ 
RHNQVTKKVS RKSSRSVRKK SRLRKYARYP PALKKTTSGE AKFYKHYTPC TCKSKCGQQC PCLTHENCCE KYCGCSKDCN 
10 NRFGGCNCAI GQCTNRQCPC FAANRECDPD LCRSCPLSCG DGTLGETPVQ IQCKNMQFLL QTNKKILIGK SDVHGWGAFT 
WDSLKKNEYL GEYTGELITH DEANERGRIE DRIGSSYLFT LNDQLEIDAR RKGNEFKFLN HSARPNCYAK LMIVRGDQRI 
GLFAERAIEE GEBLFFDYCY GPEHADWSRG REPRKTGASK RSKEARPAR 



FIS2, GI:4185501

MTLKAEWEN FSCPFCLIPC GGHEGLQLHL KSSHDAFKFE FYRAEKDHGP EVDVSVKSDT IKFGVLKDDV GNPQLSPLTF 
CSKNRNQRRQ RDDSNNVKKL NVLLMELDLD DLPRGTEND5 THVNDDNVSS PPRAHSSEKI SDILTTTQLA lAESSEPKVP 
HVNDGNVSSP PRAHSSAEKN ESTHVNDDDD VSSPPRAHSL EKNESTHVNE DNISSPPKAH SSKKNESTHM NDEDVSFPPR 
TRSSKETSDI LTTTQPAIVB PSEPKVRRGS RRKQLYAKRY KARETQPAIA ESSEPKVLHV NDENVSSPPE AHSLEKASDI 

20 LTTTQPAIAE SSEPKVPHVN DENVSSTPRA HSSKKNKSTR KNVDNVPSPP KTRSSKKTSD ILTTTQPTIA ESSEPKVRHV 
NDDNVSSTPR AHSSKKNKST RKNDDNIPSP PKTRSSKKTS NILTRTQPAI AESEPKVPHV NDDKVSSTPR AHSSKKNKST 
HKKDDNASLP PKTRSSKKTS DILATTQPAK AEPSEPKVTR VSRRKELHAE RCEAKRLERL KGRQFYHSQT MQPMTFEQVM 
SNEDSENETD DYALDISERL RLBRLVGVSK EEKRYMYLWN IFVRKQRVIA DGHVPWACEE FAKLHKEEMK NSSSFDWWWR 
MFRI!a.WNNG LICAKTFHKC TTILLSNSDE AGQFTSGSAA NANNQQSMEV DE 

FIE, GI:4567095

MSKITLGNES IVGSLTPSNK KSYKVTNRIQ EGKKPLYAW FNFLDARFFD VFVTAGGNRI TLYNCLGDGA ISALQSYADE 
DKEESFYTVS WACGVNGNPY VAAGGVKGII RVIDVNSETI HKSLVGHGDS VNEIRTQPLK PQLVITASKD ESVRLWNVET 
30 GICILIFAGA GGHRYEVLSV DFHPSDIYRF ASCGMDTTIK IWSMKEFWTY VEKSFTWTDD PSKFPTKFVQ FPVFTASIHT 
NYVDCNRWFG DFILSKSVDN EILLWEPQLK ENSPGBGASD VLLRYPVPMC DIWFIKFSCD LHLSSVAIGN QEGKVYVWDL 
KSCPPVLITK LSHNQSKSVI RQTAMSVDGS TIIiACCEDGT IV?RWDVITK 



Multi sex combs (mxc), AT5g46250 

35 MESPSISDAVPLHAPEDATADFSQPQSPLHEVDSFPVTESSDDWVNVSEIPNLSPSDDDFDHERNSGEDRDQDHGENPVETDGVWPID 
ELNQKIIRQVEYYFSDENLPTDKFLLNAMKRNKKGFVPISTIATFHKMKKLTRDHALIVSALKESSFLVVSADEKKVE»LSPL^^ 
IFTVLVENIjPEDHSNENIREIFGKAGSIKSVSICDPNAVEESEKGGKKBNFIRTRLHAFVEYETVEAAEKAAATLNNEQDWRNGLRVKLL 

eqaagkhtiqrrparrbvdkekdttgrvhdcyrggbknkktrehqnhrlhhsdnpadddggnhqkdkngnkgrwgqgrrqn 
tasssshpnyhpvevskrppgprmpdgtrgftmgrgkaippptstqtshev 

Arabidopsis TSO1-like, GI:7767427

MDTPEKSETQ IGTPVSKLKV EDSPVFSYIC NLSPIKTIKP IPITCPliSSL nyasppsvft sphavshkes rfrsqkdvsa 
45 SKEVGEEEAL VGSEPEQSYK NDCNTPRVLN DVKDNGCGKD LQVMMDNVKK KSDTPDWETL IAATTELIYG SPRESEAFSC 
LLKKTSNSEA RLRGSITATS VAVTNTDVVN NESESVDALS ILHRGVRRRC LDFEVKGNNQ QTLGESSSSC wpsiglhln 
TIAMSSKDKN VANEYSFSGN IKVGVQSSLT PVLHSQHDIV RENESGKDSG QIIEWPKSL ASVDLTPISP KKKRRKSEQS 

gegdssckrc nckkskclkl ycecfaagfy ciepcscinc fnkpihkdw latrkqiesr nplafapkvi rnsdsiievg 

EDASKTPASA RHKRGCNCKK SNCLKKYCEC YQGGVGCSIN CRCEGCKNAF GRKDGSLFEQ DEENETSGTP GTKKTQQNVE 
50 LFKPAAPPST PIPFRQPLAQ lpxssnnrll ppqshfhhga igssssgiyn IRKPDMSLLS HSRIETITED iddmsenlih 
SPITTLSPNS KRVSLSHLDS PESTPWRRNG EGRNLIRSFP TFPSLTPHH 



Sin3, P3I6.12 

55 MVGGGSAQKLTTNDALAYLKAVKDKFQDQRGKYDEFLEVMKNFKSQRVDTAGVITRVKELFKGHQELILGFNTFLPKGFEITLQPEDGQP 
PLKKRVEFEEAISFVNKIKTRFQGDDRVYKSFLDILNMYRRDSKSITEVYQEVAILFRDHSDLLVEFTHFLPDTSATASIPSVKTSVRER 
GVSLADKKDRIITPHPDHDYGTEHIDQDRERPIKKENKEHMRGTNKENEHRDARDFEPHSKKEQFLNKKQKLHIRGDDPAEISNQSKIiSG 
AVPSSSTYDEKGAMKSYSQDLAIVDRVKEKLNASEYQEFLRCLNLFSKEIISRPELQSLVGNLIGVYPDLMDSFIEFLVQCEKNEKRQIC 
NLLNLIAEGLLSGILTKKSLWSEGKYPQPSLDNDRDQBHKRDDGLRDRDHEKERLEKAAANLKW;^^ 

60 piSIASQKTEIGKLVLNDHWSVTSGSEDYSFSHMRKNQYEESLFKCEDDRFELDMIJiESVNSTTKHVBELLTKINSNEMOTNSPI 
HLTALNLRCIERLYGDHGLDVMDVIJCKlWSLALPVILTRLKQKQEE^mRCRSDFDK^ 

IJ^IKEITEKKREDDSLIAFAAGNRLSISPDLEFDYPDHDLHEDLYQiaKYSCAEMCSTEQLDKVMKIWTT^^ 

DWKSMNQNVKSGSSSAGESEGSPHiryASVADSRRSKSSRKANBHSQI.GQTSNSERDGAAGRTSDALCETAQHEKMIiKNWTS 

QAVSIBRAHDSTAIAVDGIJJSQSNGGSSIVHMTGHCNiraLHCPVTCGTBIiELKM^
REEGELSPNGDFEEDNFAVYAKTDFETFSKANDSTGNNISGDRSREGEPSCLETRAENDAEGDENAARSSEDSRNEYENGDVSGTESGGG
EDPEDDLDNNNKGESEGEABCMJUiAHDAEENGSALPVSARFLLHVKPLVKYVPSAIALHDKDKDSLKNSQVFYGNDSFYVLFiaHRILYE 
RILSAKVNSSSPEGKWRTSNTKNPTDSYARFMTALYNLLDGTSDNAKFEDDCRAIIGTQSYILFTLDKLIHKFIKHLQWVADEMDNKIiL 
QLYFYEKSRRPETIFDAVYYDNTRVLLPDENIYRIECRLSTPAKLSIQLMCNGLDKPDVTSVSIDPTFAAYLHNDFLSIQPNAREDRRIY 
LNR



Sin3, GI: 2829870 

MVGGGSAQKL TTNDALAYLK AVKDKFQDQR GKYDEFLEVM ECNFKSQRVDT AGVITRVKEL FKGHQELILG FNTFLPKGFE 
ITLQPEDGQP PLKKRVEFBE AISFVNKIKT RFQGDDRVYK SFLDILNMYR RDSKSITEVY QEVAILFRDH SDLLVEFTHF 

10 LPDTSATASI PSVKTSVRER GVSLADKKDR IITPHPDHDY GTEHIDQDRE RPIKKENKEH MRGTNKEMEH RDARDFEPHS 
KKEQFLNKKQ KLHIRGDDPA EISNQSKLSG AVPSSSTYDE KGAMKSYSQD LAIVDRVKEK LNASEYQEFL RCLNLFSKBI 
ISRPELQSLV GNLIGVYPDL MDSFIBFLVQ CBKNEKRQIC NIiLNIiLAEGIi LSGILTKKSL WSEGKYPQPS LDNDRDQEHK 
RDDGLRDRDH EKERLEKAAA NLKWAKPISE LDLSNCEQCT PSYRLLPKNY PISIASQKTE IGKLVLNDHW VSVTSGSEDY 
SFSHMRKHQY EESLFKCBDD RFELDMLLES VNSTTKHVEE LLTKINSNEL KTNSPIRVED HLTALNLRCI ERLYGDHGLD 

15 VMDVIiKKNVS LALPVILTRL KQKQEEWARC RSDFDKVWAE lYAKNYYKSL DHRSFYFKQQ DSKKLSMKAL LAEIKEITEK 
KREDDSLLAF AAGNRLSISP DLEFDYPDHD LHEDLYQLIK YSCAEMCSTE QLDKVMKIWT TFVEQIFGVP SRPQGAEDQE 
DWKSMNQNV KSGSSSAGES EGSPHNYASV ADSRRSKSSR KANEHSQLGQ TSNSERDGAA GRTSDALCET AQHEKMLKNV 
VTSDBKPESK QAVSIERAHD STALAVDGLL DQSNGGSSIV HMTGHCNNNL KPVTCGTELE LKMNDGNGPK LEVGNKKLLT 
NGIAVEITSD QEMAGTSKVE REEGELSPNG DFEEDNFAVY AKTDFETFSK ANDSTGNNIS GDRSREGEPS CLETRAENDA 

20 EGDENAARSS EDSRNEYENG DVSGTBSGGG EDPEDDLDNN NKGESEGEAE CMADAHDAEE NGSALPVSAR FLLHVKPLVK 
YVPSAIALHD KDKDSLKNSQ VFYGNDSFYV LFRLHRILYE RILSAKVNSS SPEGKWRTSN TKNPTDSYAR FMTALYNLLD 
GTSDNAKFED DCRAIIGTQS YILFTLDKLI HKFIKHLQW VADEMDNKIL QLYFYEKSRR PETIFDAVYY DNTRVLLPDE 
NIYRIECRLS TPAKLSIQLM CNGLDKPDVT SVSIDPTFAA YLHNDFLSIQ PNAREDRRIY LNR 

Arabidopsis MeCP2, GI:2827551

MNLKKSRSEN SSVASSGSKI EEQTEKSAEP TTIKVQKKAG TPGRSIDVFA VQCEKCMKWR KIDTQDEYED IRSRVQBDPF 
FCKTKEGVSC BDVGDLNYDS SRTWVIDKPG LPRTPRGFKR SLILRKDYSK MDAYYITPTG KKLKSRNEIA AFIDANQDYK 
YALLGDFNFT VPKVMBETVP SGILSDRTPK PSRKFLSGKM QGGGGRDPFG GGFGGPFGGF GGGSFGGFGR GSFGGFGGPN 
30 GPPSLMSNFF GGRDPFDDPF FTQPFGGGMF QSNFFGPSMN PFAEMHRLPQ GFIENNQPPG PSRSRGPVIE EIDSDDEKEG 
EGDKEKKGSL GKHGRSSSEA ETEDARVRER RNRQMQSMNV NTiERRNREMQ NMNVNAERRN PQMQNMNVNA MVNNGQWQPQ 
TGSYSFQSST VTYGGQNGNY YTSSKTRRTG SDGGHTVARK LNSDGRVDTT QTLHNLNEGG LVNREQPMLL PSTDPSPSHA 
RAESSRRPKA AMNLIPILAI AVASAAFLSE LVSMSLPESI WRMMTPKAKI SVFSVNFPW TYSGAKYPIA ALVLSAEKTI 
SSARRL 

Corn HBD1, GI: 13936238

MTTGSTPGSAPSQRKRHSTKDSVALYAVQCYKCYIWSTVPKEBFETLRENFTKDPWFipSRRPDSSCEDDADlEYDSSRIl^ 
PPBTERLVVMRGDYSKMDTYYVMPNGKRARCAGDVDKFLEANPEYKDRISASDFSFAPPKVVEETVSHNPAWQAAKAKKQEKTV^ 

Corn MBD1, GI:13936310

MPAPDGWTKKFTPQRGGRSEIVFVSPTGEEIKNKRQLSQYLKAHPGGPAASDFDViGTGDTPRRSARISEKVKVFDSPEGEKIPKRSRNSS 
GRKGRQGKKEAPETEEAKDAETGQDAPSEDGTKETDVBMKPAEEAKEAPTETDDAEKAADKADDTPAPAPMEBDEKETEKPAEAWAPLA 
45 QSEEKKEDAKPDEPEAVAPAPVSNPTENSAPAPAEPAAVPAPVPETESVAEPAAVLAPAPETKPDAKPAAVPAPAPENKPDABPAAAAAP 
VPDTKSVAEPAAAPAPDTKSVABPAAAAPVPETKLVAESAADAVAAPAPETKSDAEPAAAPVPETKPVAESAADAVAAPAPETKSDAEPA 
AAADPAPBIKSDAAAADPAPGTKADAAATDAAPGAEPDAAPLBNTAADKGGSEESSQPVNNVNNGHST 



Rice MBD1

EITVEESKEAPTTTKEATHRISRGIHDKGHSLTRKLKSDGNVDTTQILHNLHEDELAGFEESWKGNARHHLAGLNQNAGTSNNNNQVTVA 
PVDVAGNPLGVGLFLEESKAVIKDGTSEDRNHVSYQSPKGFLLYIYGSKSVNCXWESSKIQVQRILI 



Arabidopsis MBD1, GI: 9392683

mddgdlgnnh hnflggagnr Isaeslplid trllsqselr alsqcsslsp sssaslaasa ggdddltpki drsvfnesag 
srkqtflrlr larhpqppee ppspqrqrdd ssreeqtqva sllrslfnvd snqskeeede geeeledneg qihynsyvyq 
rpnldsiqnv liqgtsgnki krlcrgrprki rnpseenevl dltgeastyv fvdktssnlg mvsrvgssgi sldsnsvkrk 
rgrppknkee." ironlekrdsa ivnisafdke elwnlenre gtivdlsala svsedpyeee Irritvglkt keeilgfleq 

Ingewvnigk kkkwnacdy ggylprgwrl mlyikrkgsn lllacrryis pdgqqfetck evstylrsll espsknqhyy
Iqsdnktlgq qpvianesll gnsdsmdset mqylesgrts sevfeeakav engneadrvk tslmqkddna dflngvednd
ddmkkrdgnm enlatlsnse mtkslptttn elqqyf ssqi nrvq

Table 8: Amino Acid Sequence of Gal4-VP16 Transcriptional Activator

aagcttggatccaaca atg aag ctc ctg tcc tcc atc gag cag gcc tgc gac atc tgc
MKLLSSIEQACDIC

cgc ctc aag aag ctc aag tgc tcc aag gag aag ccg aag tgc gcc aag tgt ctg aag aac
RLKKLKCSKEKPKCAKCLKN

aac tgg gag tgt cgc tac tct ccc aaa acc aag cgc tcc ccg ctg acc cgc gcc cac ctc
NWECRYSPKTKRSPLTRAHL

acc gaa gtg gag tcc cgc ctg gag cgc ctg gag cag ctc ttc ctc ctg atc ttc cct cga
TEVESRLERLEQLFLLIFPR

gag gac ctc gac atg atc ctg aaa atg gac tcc ctc cag gac atc aaa gcc ctg ctc acc
EDLDMILKMDSLQDIKALLT

ggc ctc ttc gtc cag gac ... gtg ... ... gac gcc gtc acc gac cgc ctg gcc tcc gtg
GLFVQDNVNKDAVTDRLASV

gag acc gac atg ccc ctc acc ctg cgc cag cac cgc atc agc gcg acc tcc tcc tcg gag
ETDMPLTLRQHRISATSSSE

gag agc agc aac aag ggc cag cgc cag ttg acc gtc tcg acg gcc ccc ccg acc gac gtc
ESSNKGQRQLTVSTAPPTDV

agc ctg ggg gac gag ctc cac tta gac ggc gag gac gtg gcg atg gcg cat gcc gac gcg
SLGDELHLDGEDVAMAHADA

cta gac gat ttc gat ctg gac atg ttg ggg gac ggg gat tcc ccg ggg ccg gga ttt acc
LDDFDLDMLGDGDSPGPGFT

ccc cac gac tcc gcc ccc tac ggc gct ctg gat atg gcc gac ttc gag ttt gag cag atg
PHDSAPYGALDMADFEFEQM

ttt acc gat gcc ctt gga att gac gag tac ggt ggg tagatct
FTDALGIDEYGG*

Table 9: Amino Acid Sequence of FIE-15G-ESA1 Polypeptide



MSKITLGNESIVGSLTPSNKKSYKVTNRIQEGKKPLYAWFNFLDARFFDVFVTAGGNRITLYNCLGDGAISALQSYADEDKEESFYTVS 

WACGVNGNPYVAAGGVKGIIRVIDVNSETIHKSLVGHGDSVNEIRTQPLKPQLVITASKDESVRLWNVETGICILIFAGAGGHRYEVLSV 

DFHPSDIYRFASCGMDTTIKIWSMKEFWTYVEKSFTWTDDPSKFPTKFVQFPVFTASIHTNYVDCNRWFGDFILSKSVDNEILLWEPQLK 

ENSPGEGASDVLLRYPVPMCDIWFIKFSCDLHLSSVAIGNQEGKVYVWDLKSCPPVLITKLSHNQSKSVIRQTAMSVDGSTILACCEDGT 

IWRWDVITKGSPGGGGGGGGGGGGGGGMRTHIEGHEELDAASLREHEEFTKVKNISTIELGKYEIETWYFSPFPPEYNDCVKLFFCEFCL 

NFMKRKEQLQRHMRKCDLKHPPGDEIYRSGTLSMFEVDGKKNKVYAQNLCYLAKLFLDHKTLYYDVDLFLFYVLCECDDRGCH^ 

EKHSEEAYNLACILTLPSYQRKGYGKFLIAFSYELSKKEGKVGTPXKTLVGSRLTKLQRLLDSCSIRNLEKT

Table 10: Nucleotide Sequence of FIE-15G-ESA1

cggagtactgtcctccgagcggagtactgtcctccgagcggagtactgtcctccgagcggagtactgtcctccgagcggagtactgtcct 
ccgagcggagactctagaacgattatttaggtgataagagtggacaatgatcgttgacacgtggacggtccacaaattctagttttgcct 
5 ataagtatcaaagctgaatgtgtaagttggatccaacaccagttgttgttgcatgagagacttgtgagcttagattagtgtgcgagagtc 
agacagagagagagatttcgaatatcgaatgtcgaagataaccttagggaacgagtcaatagttgggtctttgactccatcgaataagaa 
atcgtacaaagtgacgaataggattcaggaagggaagaaacctttgtatgctgttgttttcaacttccttgatgctcgtttcttcgatgt 
cttcgttaccgctggtggaaatcggattactctgtacaattgtctcggagatggtgccatatcagcattgcaatcctatgctgatgaaga 
taaggaagagtcgttttacacggtaagttgggcgtgtggcgttaatgggaacccatatgttgcggctggaggagtaaaaggtataatccg 

10 agtcattgacgtcaacagtgaaacgattcataagagtcttgtgggtcatggagattcagtgaacgaaatcaggacacaacctttaaaacc 
tcaacttgtgattactgctagcaaggatgaatctgttcgtttgtggaatgttgaaactgggatatgtattttgatatttgctggagctgg 
aggtcatcgctatgaagttctaagtgtggattttcatccgtctgatatttaccgctttgctagttgtggtatggacaccactattaaaat 
atggtcaatgaaagagttttggacgtacgtcgagaagtcattcacatggactgatgatccatcaaaattccccacaaaatttgtccaatt 
ccctgtatttacagcttccattcatacaaattatgtagattgtaaccgttggtttggtgattttatcctctcaaagagtgtggacaacga 

15 gatcctgttgtgggaaccacaactgaaagagaattctcctggcgagggagcttcagatgttctattaagatacccggttccaatgtgtga 
tatttggtttatcaagttttcttgtgacctccatttaagttctgttgcgataggtaatcaggaaggaaaggtttatgtctgggatttgaa 
aagttgccctcctgttttgattacaaagttatcacacaatcaatcaaagtctgtaatcaggcaaacagccatgtctgtcgatggaagcac 
gattcttgcttgctgcgaggacgggactatatggcgctgggacgtgattaccaagggatcccccggaggtggaggtggaggtggaggtgg 
aggtggaggtggaggtggaatgaggacacatatagagggtcatgaagagctggatgcagcaagtttgcgtgaacatgaagagttcacgaa 

20 agtgaagaacatatcaacaattgagcttggaaaatatgagattgagacttggtacttctccccttttccgccagaatacaatgactgtgt 
gaagctctttttttgtgagttttgcctgaacttcatgaaacgcaaagagcagcttcaaaggcatatgagraagtgtgacctgaagcaccc 
acctggtgatgaaatttaccgaagtggtaccttgtcaatgtttgaggtagatggcaaaaagaacaaggtttatgcacagaatctctgcta 
cctggcaaagttatttcttgaccacaaaactctttactacgatgttgatttgtttctattctacgttctttgcgaatgtgatgaccgagg 
atgccacatggttgggtacttttcaaaggagaagcattcggaagaagcatacaacttagcttgcattctaaccctgccttcatatcaaag 

25 aaaaggctatggaaagttcttaatagccttttcctatgaactgtcaaagaaagagggaaaagttgggacaccggraaagacccttgtcgg 
atctaggcttactaagctacagaggttattggactcgtgttctattagaaatcttgaaaaaacataactcgagggggggcccgctagagt 
cctgctttaatgagatatgcgagacgcctatgatcgcatgatatttgctttcaattctgttgtgcacgttgtaaaaaacctgagcatgtg 
tagctcagatccttaccgccggtttcggttcattctaatgaatatatcacccgttactatcgtatttttatgaataatattctccgttca 
atttactgattgtaccctactacttatatgtacaatattaaaatgaaaacaatatattgtgctgaataggtttatagcgacatctatgat 

30 agagcgccacaataacaaacaattgcgttttattattacaaatccaattttaaaaaaagcggcagaaccggtcaaacctaaaagactgat 
tacataaatcttattcaaatttcaaaaggccccaggggctagtatctacgacacaccgagcggcgaactaataacgttcactgaagggaa 
ctccggttccccgccggcgcgcatgggtgagattccttgaagttgagtattggccgtccgctctaccgaaagttacgggcaccattcaac 
ccggtccagcacggcggccgggtaaccgacttgctgccccgagaattatgcagcatttttttggtgtatgtgggccccaaatgaagtgca 
ggtcaaaccttgacagtgacg caaatcgttgggcgggtccagggcgaattttgcgacaacatgtcgaggctcagcag 
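
As a consistency check between the nucleotide and polypeptide sequences of FIE-15G-ESA1, the linker region of the nucleotide sequence above can be translated with the standard genetic code. The Python sketch below is illustrative only; the helper translate and the constant LINKER_DNA are names introduced for this example, and the linker string is copied from the region of the sequence between the FIE and ESA1 coding segments. Codons containing an ambiguity code (such as the "r" that appears in the sequence) are reported as X.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Linker region: BamHI site codons, the glycine run, and the first ESA1 codon.
LINKER_DNA = (
    "ggatcccccggaggtggaggtggaggtggaggtgg"
    "aggtggaggtggaggtggaatg"
)

linker_protein = translate(LINKER_DNA)
```

Under these assumptions the linker translates to Gly-Ser-Pro followed by fifteen glycine residues and the methionine that begins the ESA1 segment, which agrees with the "15G" in the construct name and with the polypeptide sequence in Table 9.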

Table 11: Amino Acid Sequence of MEA-15G-ESA1 Polypeptide



MEKBNHEDDGEGLPPELNQIKBQIEKERFMIKRKFELRyiPSVATHASHHQSFDLNQPAAEDDNGGDNKSLLSRMQNPLRHFS^ 
SYEDQGyVLDEDQDYALEEDVPLFLDEDVPLLPSVKLPIVEKlPRSITWVFTKSSQLMAESDSVIGKRQIYYLNGEALELSSEEDEEDEE 
5 EDEEEIKKEKCEFSEDVDRFIWTVGQDYGLDDLWRRALAKYLBVDVSDILERYNELKLKNDGTAGEASDLTSKTITTAFQDFADRRHCR 
RCMIPDCHMHEKYEPESRSSEDKSSLFEDEDRQPCSEHCYLKVRSVTEADHVm)NDNSISNKIVVSDPNNTMWTPVEKDLYLKGIEI 
NSCDVALNILRGLKTCLEIYNYMREQDQCTMSLDLNKTTQRHNQVTKKVSRKSSRSVRKKSRLRKYARYPPALKKTTSGEAKFYKHYTPC 
TCKSKCGQQCPCLTHENCCEKYCGCSKDCNNRFGGCNCAIGQCTNRQCPCFAANRECDPDLCRSCPIiSCGDGTLGETPVQIQCKNMQFLL 
QTNKKILIGKSD^raGWGAFTWDSLKKNEYLGEYTGELITHDEAMBRGRIBDRIGSSYLFTLNDQLEIDARRKGNBFKFLNHSAR^ 
1 0 LMI VRGDQRI GLFAERAI EEGEELFFDYCYGPEHADWSRGREPRKTGASKRSKEARPARGSPGGGGGGGGGGGGGGGMRTHI EGHEELDA 
ASLREHEEFTKVBCNISTIELGKYEIETWYFSPFPPEYNDCVKLFFCEFCLNFMKRKEQLQRHMRKCDLKHPPGDEIYRSGTLSMFEVDGK 
KNKVYAQNLCyiJaCLFLDHKTLYyDVDLFIiFYVLCECDDRGCHMVGYFSKEKHSEEAYKIiACII,TLPSYQRKGYGKFLIAFSyBLSK^ 
KVGTPXKTLVGSRLTKLQRLLDSCSIRNLEKT 

Table 12: Nucleotide Sequence of MEA-15G-ESA1



atggagaaggaaaaccatgaggacgatggtgagggtttgccacccgaactaaatcagataaaagagcaaatcgaaaaggagagatttctg 

catatcaagagaaaattcgagctgagatacattccaagtgtggctactcatgcttcacaccatcaatcgtttgacttaaaccagcccgct 

gcagaggatgataatggaggagacaacaaatcacttttgtcgagaatgcaaaacccacttcgtcatttcagtgcctcatctgattataat 

tcttacgaagatcaaggttatgttcttgatgaggatcaagattatgctcttgaagaagatgtaccattatttcttgatgaagatgtacca 

ttattaccaagtgtcaagottccaattgttgagaagctaccacgatccattacatgggtcttcaccaaaagtagccagctgatggctgaa 

agtgattctgtgattggtaagagacaaatctattatttgaatggtgaggcactagaattgagcagtgaagaagatgaggaagatgaagaa 

gaagatgaggaagaaatcaagaaagaaaaatgcgaattttctgaagatgtagaccgatttatatggacggttgggcaggaotatggtttg 

gatgatctggtcgtgcggcgtgctctcgocaagtacctcgaagtggatgtttoggacatattggaaagatacaatgaactcaagcttaag 

aatgatggaactgctggtgaggcttctgatttgacatccaagacaataactactgctttccaggattttgctgatagacgtcattgccgt 

cgttgcatgatattcgattgtcatatgcatgagaagtatgagcccgagtctagatccagcgaagacaaatctagtttgtttgaggatgaa 

gatagacaacoatgcagtgagcattgttacctcaaggtcaggagtgtgacagaagctgatcatgtgatggataatgataactctatatca 

aacaagattgtggtctcagatccaaacaacactatgtggacgcctgtagagaaggatctttacttgaaaggaattgagatatttgggaga 

aa'cagttgtgatgttgcattaaacatacttcgggggcttaagacgtgcctagagatttacaattacatgcgcgaacaagatcaatgtact 

atgtcattagaccttaacaaaactacacaaagacacaatcaggttaccaaaaaagtatctcgaaaaagtagtaggtcggtccgcaaaaaa 

tcgagactocgaaaatatgctcgttatccgcotgctttaaagaaaacaactagtggagaagctaagttttataagcactacacaccatgc 

acttgcaagtcaaaatgtggacagcaatgcccttgtttaactcacgaaaattgctgcgagaaatattgcgggtgctcaaaggattgcaac 

aatogctttggaggatgtaattgtgcaattggccaatgcacaaatcgacaatgtccttgttttgctgctaatcgtgaatgcgatccagat 

ctttgtcggagttgtcctcttagctgtggagatggcactcttggtgagacaccagtgcaaatccaatgcaagaacatgcaattcctcctt 

caaaccaataaaaagattctcattggaaagtctgatgttcatggatggggtgcatttacatgggactctcttaaaaagaatgagtatctc 

ggagaatatactggagaaotgatoactcatgatgaagctaatgagcgtgggagaatagaagatcggattggttcttcctacctctttacc 

ttgaatgatcagctcgaaatcgatgctcgccgtaaaggaaacgagttcaaatttctcaatcactcagcaagacctaactgctacgccaag 

ttgatgattgtgagaggagatcagaggattggtctatttgcggagagagcaatcgaagaaggtgaggagcttttcttcgactactgctat 

ggaccagaacatgcggattggtcgcgtggtcgagaacctagaaagactggtgcttctaaaaggtctaaggaagcccgtccagctcgtgga 

tcccccggaggtggaggtggaggtggaggtggaggtggaggtggaggtggaatgaggacacatatagagggtcatgaagagctggatgca 

gcaagtttgcgtgaacatgaagagttcacgaaagtgaagaacatatcaacaattgagcttggaaaatatgagattgagacttggtacttc 

tccccttttccgccagaatacaatgactgtgtgaagctctttttttgtgagttttgcctgaacttcatgaaacgcaaagagcagottcaa 

aggcatatgagraagtgtgacctgaagcacccacctggtgatgaaatttaccgaagtggtaccttgtcaatgtttgaggtagatggcaaa 

aagaacaaggtttatgcacagaatctctgctacctggcaaagttatttcttgaccacaaaactctttactacgatgttgatttgtttcta 

ttctacgttctttgcgaatgtgatgaccgaggatgccacatggttgggtacttttcaaaggagaagcattcggaagaagcatacaactta 

gcttgcattctaaccctgccttcatatcaaagaaaaggctatggaaagttcttaatagccttttoctatgaactgtcaaagaaagaggga 

aaagttgggacaccggraaagacccttgtoggatctaggcttactaagctacagaggttattggactcgtgttotattagaaatcttgaa 

aaaacataa

Table 13

OTHER EMBODIMENTS

It is to be understood that while the invention has been described in conjunction
with the detailed description thereof, the foregoing description is intended to illustrate and
not limit the scope of the invention, which is defined by the scope of the appended claims.
Other aspects, advantages, and modifications are within the scope of the following
claims.






What is claimed is:

1. A chimeric polypeptide comprising:

a. a first polypeptide segment that exhibits histone acetyltransferase activity; and

b. a second polypeptide segment, wherein said second polypeptide segment has
40% or greater sequence identity to a subunit of a histone deacetylase
chromatin-associated protein complex,

wherein a terminus of said second polypeptide segment is linked to a terminus of
said first polypeptide segment via at least one covalent bond.

2. The chimeric polypeptide of claim 1, wherein said second polypeptide has 60% or
greater sequence identity to a subunit of a histone deacetylase chromatin-associated
protein complex.

3. The chimeric polypeptide of claim 1, wherein said second polypeptide has 80% or
greater sequence identity to a subunit of a histone deacetylase chromatin-associated
protein complex.

4. The chimeric polypeptide of claim 1, wherein said second polypeptide has 90% or
greater sequence identity to a subunit of a histone deacetylase chromatin-associated
protein complex.

5. The chimeric polypeptide of claim 1, wherein said subunit exhibits scaffold
activity.

6. The chimeric polypeptide of claim 1, wherein said subunit exhibits DNA binding
activity.

7. The chimeric polypeptide of claim 1, wherein said subunit exhibits ATPase-dependent
helicase activity.

8. The chimeric polypeptide of claim 1, wherein said subunit exhibits histone
deacetylase activity.

9. The chimeric polypeptide of claim 1, wherein said first and said second
polypeptide segments are directly linked via a peptide bond.

10. The chimeric polypeptide of claim 9, wherein the C-terminal amino acid of said
first polypeptide segment is linked to the N-terminal amino acid of said second
polypeptide segment.

11. The chimeric polypeptide of claim 9, wherein the N-terminal amino acid of said
first polypeptide segment is linked to the C-terminal amino acid of said second
polypeptide segment.

12. The chimeric polypeptide of claim 1, wherein said first and said second
polypeptide segments are indirectly linked via one or more intervening amino
acids that are situated between said first and said second polypeptide segments.

13. The chimeric polypeptide of claim 12, wherein the C-terminal amino acid of said
first polypeptide segment is linked to one of said one or more intervening amino
acids, and wherein the N-terminal amino acid of said second polypeptide segment
is linked to one of said one or more intervening amino acids.

14. The chimeric polypeptide of claim 12, wherein the N-terminal amino acid of said
first polypeptide segment is linked to one of said one or more intervening amino
acids, and wherein the C-terminal amino acid of said second polypeptide segment
is linked to one of said one or more intervening amino acids.

15. The chimeric polypeptide of claim 12, wherein said first and said second
polypeptide segments are indirectly linked via 1 to 50 intervening amino acids.

16. The chimeric polypeptide of claim 15, wherein said first and said second
polypeptide segments are indirectly linked via 10 to 50 intervening amino acids.

17. The chimeric polypeptide of claim 15, wherein said intervening amino acids
comprise at least one alanine residue.

18. The chimeric polypeptide of claim 15, wherein said intervening amino acids
comprise at least one glycine residue.

19. A nucleic acid construct encoding the polypeptide of claim 1.

20. A eukaryotic organism comprising the chimeric polypeptide of claim 1.

21. A eukaryotic organism comprising a nucleic acid encoding a chimeric
polypeptide, said chimeric polypeptide comprising:

a. a first polypeptide segment that exhibits histone acetyltransferase activity; and

b. a second polypeptide segment, wherein said second polypeptide segment has
40% or greater sequence identity to a subunit of a histone deacetylase
chromatin-associated protein complex,

wherein a terminus of said second polypeptide segment is covalently linked to a
terminus of said first polypeptide segment.



22. A eukaryotic organism comprising:

a. a first nucleic acid construct comprising a first promoter and a
transcription activator element operably linked to a coding sequence, said
coding sequence encoding:

i) a first polypeptide segment that exhibits histone acetyltransferase
activity; and

ii) a second polypeptide segment, wherein said second polypeptide
segment has 40% or greater sequence identity to a subunit of a
histone deacetylase chromatin-associated protein complex,

wherein a terminus of said second polypeptide segment is covalently
linked to a terminus of said first polypeptide segment; and

b. a second nucleic acid construct comprising a second promoter conferring
cell type-specific transcription, said second promoter operably linked to a
coding sequence for a polypeptide that binds said transcription activator
element.

23. The eukaryotic organism of claim 22, wherein said eukaryotic organism is an
animal.

24. The eukaryotic organism of claim 22, wherein said eukaryotic organism is a plant.

25. The eukaryotic organism of claim 24, wherein said plant is a monocot.

26. The eukaryotic organism of claim 25, wherein said monocot is selected from the
group consisting of corn and rice.

27. The eukaryotic organism of claim 24, wherein said plant is a dicot.

28. The eukaryotic organism of claim 27, wherein said dicot is selected from the
group consisting of soybean and rape.

29. The eukaryotic organism of claim 24, wherein said plant comprises an agent or
mutation that alters the DNA methylation state in said plant relative to a
corresponding plant that lacks said agent or mutation.

30. The eukaryotic organism of claim 29, wherein said DNA methylation state is
decreased relative to a corresponding organism that lacks said agent or mutation.

31. The eukaryotic organism of claim 30, wherein said mutation is in a C5 DNA
methyltransferase gene.

32. The eukaryotic organism of claim 30, wherein said agent affects expression of a
C5 DNA methyltransferase gene.

33. The eukaryotic organism of claim 32, wherein said agent is an antisense nucleic
acid.

34. A method for detecting the expression of one or more genes in a eukaryote, said
method comprising:

a. isolating macromolecules from one or more specific cells in said
eukaryote, said eukaryote comprising a nucleic acid construct that has a
promoter operably linked to a coding sequence that encodes:

i) a first polypeptide segment that exhibits histone acetyltransferase
activity; and

ii) a second polypeptide segment, wherein said second polypeptide
segment has 40% or greater sequence identity to a subunit of a
histone deacetylase chromatin-associated protein complex,

wherein a terminus of said second polypeptide segment is covalently
linked to a terminus of said first polypeptide segment; and

b. determining the presence or amount of at least one of said macromolecules
in at least one of said specific cells.

35. The method of claim 34, wherein said macromolecules are polypeptides. 
25 36. The method of claim 34, wherein said macromolecules are nucleic acids. 

37. The method of claim 34, wherein said eukaryote is an animal. 

38. The method of claim 34, wherein said eukaryote is a plant. 

30 
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39. The method of claim 38, wherein said promoter confers cell-type specific 
transcription in a plant reproductive tissue. 

40. The method of claim 39, wherein said reproductive tissue is a flower. 

41. The method of claim 40, wherein said reproductive tissue is selected from the 
group consisting of: ovule, and central cell. 

42. The method of claim 39, wherein said reproductive tissue is a seed. 

43. The method of claim 42, wherein said reproductive tissue is selected from the 
group consisting of: endosperm, embryo, and zygote. 

44. The method of claim 38, wherein said promoter confers cell-type specific 
transcription in a plant vegetative tissue. 

45. A method for detecting the expression of one or more genes in a eukaryote, said 
method comprising: 

a. isolating macromolecules from one or more specific cells in said 
eukaryote, said eukaryote comprising: 

i) a first nucleic acid construct having a first promoter and a 
transcription activator element operably linked to a coding 
sequence, said coding sequence encoding: 

a) a first polypeptide segment that exhibits histone 
acetyltransferase activity; and 

b) a second polypeptide segment, wherein said second 
polypeptide segment has 40% or greater sequence identity 
to a subunit of a histone deacetylase chromatin-associated 
protein complex, 
wherein a terminus of said second polypeptide segment is 
covalently linked to a terminus of said first polypeptide segment; 
and 

ii) a second nucleic acid construct comprising a second promoter 
conferring cell type-specific transcription, said second promoter 
operably linked to a coding sequence for a polypeptide that binds 
said transcription activator element; and 
b. determining the presence or amount of at least one of said macromolecules 
in at least one of said specific cells. 

46. The method of claim 45, wherein said macromolecules are polypeptides. 

47. The method of claim 45, wherein said macromolecules are nucleic acids. 

48. The method of claim 45, wherein said eukaryote is an animal. 

49. The method of claim 45, wherein said eukaryote is a plant. 

50. The method of claim 49, wherein said second promoter confers cell-type specific 
transcription in a plant reproductive tissue. 

51. The method of claim 50, wherein said reproductive tissue is a flower. 

52. The method of claim 51, wherein said reproductive tissue is selected from the 
group consisting of: ovule, and central cell. 

53. The method of claim 50, wherein said reproductive tissue is a seed. 

54. The method of claim 53, wherein said reproductive tissue is selected from the 
group consisting of: endosperm, embryo, and zygote. 

55. The method of claim 49, wherein said second promoter confers cell-type specific 
transcription in a plant vegetative tissue. 

56. A method for modulating gene expression in a eukaryote, said method comprising 
making a eukaryote having a nucleic acid construct comprising a cell-type specific 
promoter operably linked to a coding sequence that encodes: 

a. a first polypeptide segment that exhibits histone acetyltransferase activity; 
and 

b. a second polypeptide segment, wherein said second polypeptide segment 
has 40% or greater sequence identity to a subunit of a histone deacetylase 
chromatin-associated protein complex, 
wherein a terminus of said second polypeptide segment is covalently linked to a 
terminus of said first polypeptide segment; and 

wherein said eukaryote exhibits modulated gene expression in cells in which said 
promoter confers cell-type specific transcription. 

57. The method of claim 56, wherein said eukaryote has compositional alterations 
relative to a corresponding organism that lacks said nucleic acid construct. 

58. The method of claim 56, wherein said eukaryote has developmental alterations 
relative to a corresponding organism that lacks said nucleic acid construct. 

59. The method of claim 56, wherein said eukaryote has phenotypic alterations 
relative to a corresponding organism that lacks said nucleic acid construct. 

60. The method of claim 56, wherein said eukaryote is an animal. 

61. The method of claim 56, wherein said eukaryote is a plant. 

62. The method of claim 61, wherein said promoter confers cell-type specific 
transcription in a plant reproductive tissue. 

63. The method of claim 62, wherein said reproductive tissue is a flower. 

64. The method of claim 63, wherein said reproductive tissue is selected from the 
group consisting of: ovule, and central cell. 

65. The method of claim 62, wherein said reproductive tissue is a seed. 

66. The method of claim 65, wherein said reproductive tissue is selected from the 
group consisting of: endosperm, embryo, and zygote. 

67. The method of claim 61, wherein said promoter confers cell-type specific 
transcription in a plant vegetative tissue. 

68. The method of claim 61, wherein said plant comprises an agent or mutation that 
alters the DNA methylation state in said plant relative to a corresponding plant 
that lacks said agent or mutation. 

69. The method of claim 68, wherein said DNA methylation state is decreased relative 
to a corresponding plant that lacks said agent or mutation. 

70. The method of claim 69, wherein said mutation is in a C5 DNA methyltransferase 
gene. 

71. The method of claim 69, wherein said agent affects expression of a C5 DNA 
methyltransferase gene. 

72. The method of claim 71, wherein said agent is an antisense nucleic acid. 

73. The method of claim 61, wherein said modulated gene expression alters seed 
development. 

74. The method of claim 61, wherein said modulated gene expression alters embryo 
development. 

75. The method of claim 61, wherein said modulated gene expression alters 
endosperm development. 

76. The method of claim 61, wherein said modulated gene expression alters seed yield 
by mass. 

77. A method for modulating gene expression in a eukaryote, said method comprising 
making a eukaryote, said eukaryote comprising: 

a. a first nucleic acid construct having a first promoter and a transcription 
activator element operably linked to a coding sequence, said coding 
sequence encoding: 

i) a first polypeptide segment that exhibits histone acetyltransferase 
activity; and 

ii) a second polypeptide segment, wherein said second polypeptide 
segment has 40% or greater sequence identity to a subunit of a 
histone deacetylase chromatin-associated protein complex, 
wherein a terminus of said second polypeptide segment is covalently 
linked to a terminus of said first polypeptide segment; and 
b. a second nucleic acid construct comprising a second promoter conferring 
cell type-specific transcription, said second promoter operably linked to a 
coding sequence for a polypeptide that binds said transcription activator 
element, 

wherein said eukaryote exhibits modulated gene expression in cells in which said 
second promoter confers cell-type specific expression. 

78. The method of claim 77, wherein said eukaryote has compositional alterations 
relative to a corresponding organism that lacks said nucleic acid construct. 

79. The method of claim 77, wherein said eukaryote has developmental alterations 
relative to a corresponding organism that lacks said nucleic acid construct. 

80. The method of claim 77, wherein said eukaryote has phenotypic alterations 
relative to a corresponding organism that lacks said nucleic acid construct. 

81. The method of claim 77, wherein said eukaryote is an animal. 

82. The method of claim 77, wherein said eukaryote is a plant. 

83. The method of claim 82, wherein said promoter confers cell-type specific 
transcription in a plant reproductive tissue. 

84. The method of claim 83, wherein said reproductive tissue is a flower. 

85. The method of claim 84, wherein said reproductive tissue is selected from the 
group consisting of: ovule, and central cell. 

86. The method of claim 83, wherein said reproductive tissue is a seed. 

87. The method of claim 86, wherein said reproductive tissue is selected from the 
group consisting of: endosperm, embryo, and zygote. 

88. The method of claim 82, wherein said promoter confers cell-type specific 
transcription in a plant vegetative tissue. 

89. The method of claim 82, wherein said plant comprises an agent or mutation that 
alters the DNA methylation state in said plant relative to a corresponding plant 
that lacks said agent or mutation. 

90. The method of claim 89, wherein said DNA methylation state is decreased relative 
to a corresponding plant that lacks said agent or mutation. 

91. The method of claim 90, wherein said mutation is in a C5 DNA methyltransferase 
gene. 

92. The method of claim 90, wherein said agent affects expression of a C5 DNA 
methyltransferase gene. 

93. The method of claim 92, wherein said agent is an antisense nucleic acid. 

94. The method of claim 82, wherein said modulated gene expression alters seed 
development. 

95. The method of claim 82, wherein said modulated gene expression alters embryo 
development. 

96. The method of claim 82, wherein said modulated gene expression alters 
endosperm development. 

97. The method of claim 82, wherein said modulated gene expression alters seed yield 
by mass. 

98. A method for making a genetically modified eukaryote, said method comprising: 
a. providing a first eukaryote comprising a first nucleic acid construct, said 
first nucleic acid construct comprising a first promoter and a transcription 
activator element operably linked to a coding sequence, said coding 
sequence encoding: 

i) a first polypeptide segment that exhibits histone acetyltransferase 
activity; and 
ii) a second polypeptide segment, wherein said second polypeptide 
segment has 40% or greater sequence identity to a subunit of a 
histone deacetylase chromatin-associated protein complex, 
wherein a terminus of said second polypeptide segment is covalently 
linked to a terminus of said first polypeptide segment; and 

b. providing a second eukaryote comprising a second nucleic acid construct, 
said second nucleic acid construct comprising a second promoter 
conferring embryo-specific transcription, said second promoter operably 
linked to a coding sequence for a polypeptide that binds said transcription 
activator element, 

c. crossing said first eukaryote and said second eukaryote to form genetically 
modified progeny that are sterile. 

99. The method of claim 98, wherein said eukaryote is an animal. 

100. The method of claim 98, wherein said eukaryote is a plant. 